基于图编辑距离上界的图相似性判定方法研究
发布时间:2019-01-04 18:58
【摘要】:随着科学技术的飞速发展以及计算机的普及,产生了海量的数据。在网络、数学、生物、化学等领域,数据起着至关重要的作用。图是一种通用且具有广泛研究价值的数据结构,作为一种最直观的表达方式,越来越多的应用依赖于图数据进行表示。图数据的应用变得越来越广泛,其规模迅速增大,同时数据也变得更加多样化和复杂化,如何从海量的图数据中挖掘有效的信息成为研究的核心问题。近年来,我们可以发现在图数据库中存在噪音以及错误的图越来越多,因此学术界愈加关注如何在图数据库中进行相似性搜索,从而得到隐藏的信息。图相似性搜索对于深入研究图数据库起着强烈的推进作用。图编辑距离是一种用来确定图相似性最有效的措施,在模式识别、计算机视觉、生物信息等领域有着非常广泛的应用。图编辑距离作为一种度量两个图之间相似性的有效方法,是不精确图匹配的基础。不精确的图匹配已经成为模式分析领域的重要研究焦点之一,在现实生活中无处不在。当前已经验证了图编辑距离计算是一个NP问题,在通常情况下,图编辑距离计算在多项式时间内无法解决,计算起来十分复杂,浪费了大量的时间,效率低下,所以如何避免过多的图编辑距离计算成为了亟待解决的问题。为此,学术界提出了计算图编辑距离边界的算法,以此来提高计算效率。图编辑距离边界算法虽然可以减少计算量,提高效率,但仍有优化的空间。本文的主要内容如下:(1)简单介绍了几种与图编辑距离相关的算法及各自存在的优缺点。(2)结合现有的图编辑距离边界过滤算法,提出了两种新的方法,分别针对两种不同的情况,对图编辑距离边界过滤算法进行优化,从而进一步减少计算所消耗的时间。(3)建立概率索引图,利用概率索引图找到更好的中介图,进而提出快速收敛方法来解决图相似性问题,进一步减少不必要的图编辑距离计算,达到提高效率的目的。(4)最后验证本文提出的方法,通过在真实数据集与人工数据集上进行实验研究,从各个方面综合分析新方法。结果表明,这些方法具有良好的性能,同时通过实验结果也证实了使用我们提出的新方法来进行图相似性搜索的有效性和可行性。
[Abstract]:With the rapid development of science and technology and the popularization of computer, massive data are produced. Data play an important role in the fields of network, mathematics, biology, chemistry and so on. As a kind of intuitionistic expression, graph is a kind of universal data structure with extensive research value. More and more applications depend on graph data for representation. The application of graph data becomes more and more extensive, its scale increases rapidly, at the same time, the data becomes more diversified and more complicated. How to mine the effective information from the massive graph data becomes the core problem of the research. In recent years, we can find that there are more and more noise and errors in the graph database, so the academic circles pay more attention to how to search the similarity in the graph database to get hidden information. Graph similarity search plays an important role in the further study of graph database. Graph editing distance is the most effective measure to determine the similarity of graphs. It is widely used in pattern recognition, computer vision, biological information and other fields. As an effective method to measure the similarity between two graphs, graph editing distance is the basis of inexact graph matching. Imprecise graph matching has become an important research focus in the field of pattern analysis and is ubiquitous in real life. At present, it has been proved that the calculation of graph edit distance is a NP problem. In general, the calculation of graph edit distance can not be solved in polynomial time. The calculation is very complicated, and it wastes a lot of time and is inefficient. Therefore, how to avoid too much graph editing distance calculation has become an urgent problem. In order to improve the computational efficiency, an algorithm for calculating the distance boundary of graph editing is proposed in this paper. Although graph editing distance boundary algorithm can reduce computation cost and improve efficiency, there is still room for optimization. The main contents of this paper are as follows: (1) several algorithms related to graph editing distance and their advantages and disadvantages are briefly introduced. (2) two new methods are proposed in combination with the existing edge filtering algorithms for graph editing distance. For two different cases, the edge filtering algorithm of graph editing distance is optimized to further reduce the computation time. (3) the probabilistic index graph is built and the better intermediary graph is found by using probabilistic index graph. Furthermore, a fast convergence method is proposed to solve the problem of graph similarity, further reduce the calculation of unnecessary graph editing distance, and achieve the purpose of improving efficiency. (4) finally, the method proposed in this paper is verified. Through the experimental research on real data set and artificial data set, the new method is analyzed synthetically from all aspects. The results show that these methods have good performance, and the effectiveness and feasibility of using the new method to search the similarity of graphs are also verified by the experimental results.
【学位授予单位】:辽宁大学
【学位级别】:硕士
【学位授予年份】:2017
【分类号】:O157.5
本文编号:2400669
[Abstract]:With the rapid development of science and technology and the popularization of computer, massive data are produced. Data play an important role in the fields of network, mathematics, biology, chemistry and so on. As a kind of intuitionistic expression, graph is a kind of universal data structure with extensive research value. More and more applications depend on graph data for representation. The application of graph data becomes more and more extensive, its scale increases rapidly, at the same time, the data becomes more diversified and more complicated. How to mine the effective information from the massive graph data becomes the core problem of the research. In recent years, we can find that there are more and more noise and errors in the graph database, so the academic circles pay more attention to how to search the similarity in the graph database to get hidden information. Graph similarity search plays an important role in the further study of graph database. Graph editing distance is the most effective measure to determine the similarity of graphs. It is widely used in pattern recognition, computer vision, biological information and other fields. As an effective method to measure the similarity between two graphs, graph editing distance is the basis of inexact graph matching. Imprecise graph matching has become an important research focus in the field of pattern analysis and is ubiquitous in real life. At present, it has been proved that the calculation of graph edit distance is a NP problem. In general, the calculation of graph edit distance can not be solved in polynomial time. The calculation is very complicated, and it wastes a lot of time and is inefficient. Therefore, how to avoid too much graph editing distance calculation has become an urgent problem. In order to improve the computational efficiency, an algorithm for calculating the distance boundary of graph editing is proposed in this paper. Although graph editing distance boundary algorithm can reduce computation cost and improve efficiency, there is still room for optimization. The main contents of this paper are as follows: (1) several algorithms related to graph editing distance and their advantages and disadvantages are briefly introduced. (2) two new methods are proposed in combination with the existing edge filtering algorithms for graph editing distance. For two different cases, the edge filtering algorithm of graph editing distance is optimized to further reduce the computation time. (3) the probabilistic index graph is built and the better intermediary graph is found by using probabilistic index graph. Furthermore, a fast convergence method is proposed to solve the problem of graph similarity, further reduce the calculation of unnecessary graph editing distance, and achieve the purpose of improving efficiency. (4) finally, the method proposed in this paper is verified. Through the experimental research on real data set and artificial data set, the new method is analyzed synthetically from all aspects. The results show that these methods have good performance, and the effectiveness and feasibility of using the new method to search the similarity of graphs are also verified by the experimental results.
【学位授予单位】:辽宁大学
【学位级别】:硕士
【学位授予年份】:2017
【分类号】:O157.5
【参考文献】
相关期刊论文 前7条
1 胡敏;程天梅;王晓华;;融合全局和局部特征的人脸识别[J];电子测量与仪器学报;2013年09期
2 曹斌;尹建伟;陈慧蕊;;基于Levenshtein距离的流程检索方法[J];计算机集成制造系统;2012年08期
3 刘华清;陈振东;涂刚;;数据库索引与优化[J];现代计算机(专业版);2012年18期
4 李路;周良;丁秋林;;基于贝叶斯网络的草图符号识别研究[J];计算机科学;2011年06期
5 袁尧;张玉成;董雯霞;郑如松;杨育波;石晶林;;基于二分图匹配的多业务流网络选择机制[J];软件学报;2010年06期
6 肖冰;李洁;高新波;;一种度量图像相似性和计算图编辑距离的新方法[J];电子学报;2009年10期
7 师军,曹菡;启发式搜索问题研究[J];微型机与应用;2004年10期
,本文编号:2400669
本文链接:https://www.wllwen.com/kejilunwen/yysx/2400669.html