
Research on Betweenness Centrality Approximation Algorithms Based on Graph Features


Topic: graph features + betweenness centrality; Source: master's thesis, Qufu Normal University, 2015


【Abstract】: Research on complex networks now reaches into computer science, bioengineering, physics, urban transportation, and many other disciplines, and is ever more closely tied to everyday life, which makes complex network analysis an active research topic. One important direction within it is assessing the importance of nodes and finding the key nodes of a network. For example, controlling the leader of a terrorist organization makes it possible to control the whole organization and thereby prevent attacks; protecting the key servers of a network guards them against viruses and hackers and keeps the network running; isolating sources of infection helps contain the spread of an epidemic; and new drugs can be designed around proteins that are vital to the human body. All of these applications rely on the betweenness centrality algorithm, which therefore occupies an important place in complex network analysis.

Although betweenness centrality is widely used in practice, two problems remain. First, its application scenario is not well defined: many algorithms measure the importance of network nodes, and other key-node discovery algorithms such as PageRank are also in common use, but there has been no study of when betweenness centrality should be the algorithm of choice, so determining its application scenario is a pressing problem. Second, betweenness centrality is expensive to compute and slow to run, which makes it unsuitable for the big-data era: the fastest exact algorithm takes O(V*E) time, so large graphs take a long time to process; a graph with millions of nodes can take over a dozen months to finish, and even though related work has reduced the running time, computing betweenness centrality for a graph with more than a million nodes still takes months.

To determine the application scenario, this thesis compares where in the network the key-node sets returned by betweenness centrality and by PageRank are located. Experiments on 15 real data sets spanning 7 network types characterize the application scenarios of the two key-node discovery algorithms: betweenness centrality finds the nodes that are important to the whole network, whereas PageRank finds the nodes that are well known within a particular domain. To address the high computational cost, this thesis reduces it through approximation. Since today's complex networks exhibit small-world structure and power-law degree distributions, features of the graph itself, possibly combined with other intuitive graph features, can be exploited by approximation algorithms to bring the computation time of betweenness centrality down to a practical level. Based on the characteristics of the graph data, this thesis proposes a vertex-weighted approximation algorithm for betweenness centrality: it reduces the amount of computation by selecting high-influence source vertices and weighting vertices. The algorithm achieves an average speedup of 25, and the error of the approximate results is below 0.01%, which satisfies the practical requirement of caring only about the ranking of the important nodes. The vertex-weighted approximation algorithm thus greatly reduces the cost of betweenness centrality while preserving the accuracy of the approximate results.
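For reference, the O(V*E) bound quoted above is that of Brandes' exact algorithm for unweighted graphs: one breadth-first search per source vertex, followed by back-propagation of pair dependencies. The Python sketch below is not code from the thesis; it assumes the graph is given as an adjacency-list dict, and for undirected graphs each pair is counted from both endpoints (halve the scores if the usual normalization is wanted).

```python
from collections import deque

def brandes_betweenness(adj):
    """Exact betweenness centrality for an unweighted graph.

    adj: dict mapping each vertex to an iterable of neighbours.
    One BFS per source vertex, so the total cost is O(V*E).
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Single-source shortest paths by BFS.
        stack = []                       # vertices in non-decreasing distance order
        preds = {v: [] for v in adj}     # shortest-path predecessors
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:              # w discovered for the first time
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:   # edge (v, w) lies on a shortest path
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies from the farthest vertices inward.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc

# Tiny usage example: path graph a-b-c, where b lies on the only a-c shortest path.
g = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(brandes_betweenness(g))  # b gets the highest score
```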
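The abstract only summarizes the proposed vertex-weighted approximation (select high-influence source vertices, weight vertices), so the sketch below illustrates just the general source-sampling idea behind such approximations: run the single-source phase from a small set of pivot vertices instead of from all vertices, then rescale the accumulated dependencies. The degree-based pivot choice, the sqrt(n) sample size, and the n/k rescaling are illustrative assumptions, not the thesis's algorithm.

```python
from collections import deque

def _accumulate(adj, s, bc):
    """One Brandes-style single-source pass from s, adding its dependency
    contributions into bc (same computation as in the exact sketch above)."""
    stack, queue = [], deque([s])
    preds = {v: [] for v in adj}
    sigma = {v: 0 for v in adj}
    dist = {v: -1 for v in adj}
    sigma[s], dist[s] = 1, 0
    while queue:
        v = queue.popleft()
        stack.append(v)
        for w in adj[v]:
            if dist[w] < 0:
                dist[w] = dist[v] + 1
                queue.append(w)
            if dist[w] == dist[v] + 1:
                sigma[w] += sigma[v]
                preds[w].append(v)
    delta = {v: 0.0 for v in adj}
    while stack:
        w = stack.pop()
        for v in preds[w]:
            delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
        if w != s:
            bc[w] += delta[w]

def approx_betweenness(adj, num_sources=None):
    """Approximate betweenness from a sample of source vertices (illustrative
    only -- not the thesis's vertex-weighted algorithm).  The highest-degree
    vertices stand in for 'high-influence' sources, and the accumulated scores
    are rescaled by n/k so they stay comparable to the exact values."""
    n = len(adj)
    k = num_sources or max(1, int(round(n ** 0.5)))
    sources = sorted(adj, key=lambda v: len(adj[v]), reverse=True)[:k]
    bc = {v: 0.0 for v in adj}
    for s in sources:
        _accumulate(adj, s, bc)
    return {v: bc[v] * n / k for v in bc}
```

With k pivot sources the cost drops from O(V*E) to roughly O(k*E); the speedup of about 25 and the error below 0.01% quoted above refer to the thesis's own experiments, not to this sketch.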
【Degree-granting institution】: Qufu Normal University (曲阜师范大学)
【Degree level】: Master's
【Year conferred】: 2015
【CLC number】: O157.5



