当前位置:主页 > 科技论文 > 搜索引擎论文 >

海量多模式串匹配算法关键技术研究

发布时间:2018-04-16 07:50

  本文选题:海量多模式匹配 + 布尔表达式匹配 ; 参考:《哈尔滨工程大学》2013年硕士论文


【摘要】:经过40余年的发展,字符串匹配已经从单一的单模式串匹配发展成为包含多模式匹配、正则匹配、近似匹配等多个新方向的研究领域。而随着计算机技术、网络技术的发展,社会的进步,字符串匹配也有了越来越广泛的应用场所。在信息检索、入侵检测、网络数据分析等多个领域都能够看到字符串匹配的身影。布尔表达式匹配也有很多的应用,如搜索引擎中的布尔查询,病毒检测系统的特征组合匹配等等。字符串匹配算法以往的研究和实验分析都是在模式集规模在几千到几万的情况下进行的,对于大规模模式集下的算法没有进行过深入的分析。本文主要研究内容分为如下两个部分: 研究在大规模模式集下的多模式匹配问题的特点和难点,提出了基于Wu-manber算法的三种改进算法:基于最短关键字长度哈希的改进算法、多哈希值预检改进算法、成组比较改进算法。 研究布尔表达式匹配问题,提出更泛化的复杂与或布尔表达式匹配问题,,并且针对该问题提出扩展位标记算法。此外本文还提出了布尔表达式化简方法,来进一步减少表达式数目,提高匹配效率。 综上所述,本文在研究和总结现有的模式匹配算法和布尔表达式匹配算法的基础上,重点对大规模模式集下匹配算法、复杂与或布尔表达式匹配和表达式化简进行研究,提出优化方案,并且通过实验来验证算法的可行性以及空间和时间效率。最后本文还展望了该领域的未来发展趋势。
[Abstract]:After more than 40 years' development, string matching has developed from single single pattern string matching to many new research fields, such as multi-pattern matching, regular matching, approximate matching and so on.With the development of computer technology, network technology and society, string matching has been used more and more widely.String matching can be seen in many fields, such as information retrieval, intrusion detection, network data analysis and so on.Boolean expression matching also has many applications, such as Boolean query in search engine, feature combination matching in virus detection system and so on.The previous research and experimental analysis of string matching algorithm are carried out in the case of pattern set size ranging from thousands to tens of thousands, but no in-depth analysis has been done for the algorithm under large-scale pattern set.The main contents of this paper are as follows:This paper studies the characteristics and difficulties of multi-pattern matching problem in large-scale pattern sets, and proposes three improved algorithms based on Wu-manber algorithm: an improved algorithm based on the shortest keyword length hash, an improved multi-hash pre-checking algorithm.The improved algorithm is compared in groups.In this paper, the problem of Boolean expression matching is studied, and a more generalized problem of complex or Boolean expression matching is proposed, and an extended bit marking algorithm is proposed to solve the problem.In addition, a Boolean expression simplification method is proposed to further reduce the number of expressions and improve the matching efficiency.To sum up, on the basis of studying and summarizing the existing pattern matching algorithms and Boolean expression matching algorithms, this paper focuses on the large-scale pattern set matching algorithm, complex or Boolean expression matching and expression simplification.The optimization scheme is proposed, and the feasibility of the algorithm and space and time efficiency are verified by experiments.Finally, the future development trend of this field is prospected.
【学位授予单位】:哈尔滨工程大学
【学位级别】:硕士
【学位授予年份】:2013
【分类号】:TP301.6

【参考文献】

相关期刊论文 前7条

1 杨军;邓芳林;;基于Snort入侵检测系统模式匹配改进算法研究[J];计算机安全;2011年06期

2 殷丽华,张冬艳,方滨兴;面向入侵检测的单模式匹配算法性能分析[J];计算机工程与应用;2004年24期

3 曹京;谭建龙;刘萍;郭莉;;布尔表达式匹配问题研究[J];计算机应用研究;2007年09期

4 宋云;龙际珍;;规则数量无关的多布尔表达式匹配算法[J];软件导刊;2012年03期

5 李晓明,凤旺森;两种对URL的散列效果很好的函数[J];软件学报;2004年02期

6 李安怀;荆继武;;网络安全系统中的快速规则匹配[J];计算机工程与设计;2007年06期

7 曹京;刘燕兵;刘萍;谭建龙;郭莉;;定序窗口布尔表达式匹配技术研究[J];通信学报;2007年12期



本文编号:1757934

资料下载
论文发表

本文链接:https://www.wllwen.com/kejilunwen/sousuoyinqinglunwen/1757934.html


Copyright(c)文论论文网All Rights Reserved | 网站地图 |

版权申明:资料由用户bd7eb***提供,本站仅收录摘要或目录,作者需要删除请E-mail邮箱bigeng88@qq.com