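Research on Regular Expression Matching Based on Two-Level Storage
Published: 2018-03-27 20:13
Topic: regular expressions. Focus: access probability. Source: National University of Defense Technology, master's thesis, 2013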
[Abstract]: As networks have developed rapidly, the security problems caused by their openness have become increasingly severe. Deep packet inspection is a core network security technology: it matches packet content against a predefined rule set in order to identify malicious information or protocol signatures hidden in that content. Regular expression matching is the main technique used in deep packet inspection, and within it, matching performance and storage requirements constrain each other. The rapid growth of gigabit networks requires backbone networks to match at line rate, while increasingly complex rules require memory of sufficient capacity; yet memory devices generally do not offer large capacity and high throughput at the same time. This poses a great challenge for regular expression matching, and new techniques must be explored to resolve the conflict between performance and storage at its root. This thesis proposes, for the first time, a matching technique based on two-level storage: the first level uses a small, high-speed memory to address performance, and the second level uses a large, low-speed memory to address storage capacity. Combining the two kinds of memory yields relatively high throughput at a relatively low storage cost. The main contributions are:
1) The thesis shows that the tension between throughput and storage is the central problem in regular expression matching, and on that basis proposes a matching engine built on two-level storage. State access probabilities are analyzed statistically through simulated matching experiments; the experiments show that they follow a Zipf distribution, which strongly favors a two-level storage architecture.
2) Markov chain theory is used to model the state transitions that occur during packet matching, with the steady-state vector taken as the theoretical access probability of each state. Methods for computing the steady-state vector are discussed and implemented in code. The experimental data show that the model agrees closely with the measured distribution of state access probabilities.
3) The proposed two-level storage matching engine is implemented on the open network experiment platform Net Magic. By exploiting the multiple RAM blocks inside the FPGA, several matching threads are instantiated to raise system performance toward line rate.
The experimental results show that the method greatly reduces storage cost while maintaining a given level of throughput.
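The Markov chain idea described in the abstract can be illustrated with a small sketch. This is not the thesis's code: the 4-state transition matrix `P`, the function names `stationary_distribution` and `split_two_level`, and the `fast_slots` parameter are all hypothetical, chosen only to show how a steady-state vector might be computed by power iteration and then used to decide which automaton states to keep in the fast first-level memory.

```python
from typing import List, Set, Tuple


def stationary_distribution(P: List[List[float]],
                            tol: float = 1e-12,
                            max_iter: int = 10_000) -> List[float]:
    """Approximate the steady-state vector by power iteration (pi <- pi * P)."""
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


def split_two_level(pi: List[float], fast_slots: int) -> Tuple[Set[int], float]:
    """Put the `fast_slots` most probable states in fast memory.

    Returns the chosen state set and the fraction of accesses it serves.
    """
    order = sorted(range(len(pi)), key=lambda s: pi[s], reverse=True)
    fast = set(order[:fast_slots])
    hit_rate = sum(pi[s] for s in fast)
    return fast, hit_rate


# Hypothetical transition matrix for a 4-state automaton (rows sum to 1).
# Most input bytes send the automaton back to state 0, so access is skewed.
P = [
    [0.90, 0.10, 0.00, 0.00],
    [0.80, 0.10, 0.10, 0.00],
    [0.80, 0.00, 0.10, 0.10],
    [1.00, 0.00, 0.00, 0.00],
]

pi = stationary_distribution(P)
fast, hit_rate = split_two_level(pi, fast_slots=2)
print("steady-state vector:", [round(p, 4) for p in pi])
print("states in fast memory:", sorted(fast))
print("fraction of accesses served by fast memory:", round(hit_rate, 4))
```

In this toy example, half of the states account for nearly all accesses, which is the kind of skew that would make a small first-level memory worthwhile. Real rule sets produce automata with far more states, and the transition probabilities would have to be estimated from traffic.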
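[Degree-granting institution]: National University of Defense Technology
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP393.08; TP333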
Article ID: 1673066
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/1673066.html