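Research on Community Detection Algorithms Based on Partial Prior Knowledge
Published: 2018-01-21 19:52
Keywords: community detection; partial prior knowledge; label propagation; local loop; datasets. Source: Tianjin University of Science and Technology, 2016 master's thesis. Document type: degree thesis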
[Abstract]: With the arrival of the DT (Data Technology) era, the value of data is receiving growing attention across industries. Extracting valuable information from large, complex data to guide and improve work and daily life is therefore of considerable importance. Community detection is an important research direction in the study of complex networks. It searches complex network data for latent community structure, uncovering knowledge hidden in massive network data and regularities hidden beneath ordinary phenomena, so that people can be offered personalized, well-founded services and make more effective decisions. Building on a study of the label propagation algorithm and drawing on the prior knowledge available during community detection, this thesis proposes a label propagation community detection algorithm based on local loops and validates it experimentally. The work consists of the following two parts.
(1) A label propagation community detection algorithm based on local loops. First, the thesis surveys community detection algorithms and analyzes the label propagation algorithm and its shortcomings in detail. Second, using the prior knowledge that exists between nodes during community detection, it proposes an improved label propagation algorithm based on local loops: during propagation, when several labels tie for the maximum, a shortest-local-loop selection strategy replaces random selection. This suppresses the spread of labels between communities and improves the accuracy of the algorithm; a simple example is used to argue the feasibility of the approach from a theoretical standpoint. Finally, to verify the effectiveness of the improved algorithm, two types of datasets are used, classic real-world datasets and synthetic benchmark datasets, with modularity and NMI (normalized mutual information) as the evaluation criteria, and the improved algorithm is assessed by comparison. The experimental results show that the local-loop-based label propagation algorithm achieves better partitions.
(2) Experimental validation. A representative real-world Weibo network is chosen as the experimental dataset. After preprocessing to remove special nodes, the improved algorithm is applied to partition the real Weibo network, confirming that it also produces good partitions on a real network.
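The tie-breaking idea in the abstract can be sketched in Python as follows. The abstract does not define a "local loop" precisely, so this sketch makes several assumptions that are not taken from the thesis: the local loop for a candidate label is taken to be the shortest cycle through the edge joining the node to a neighbor carrying that label; remaining ties go to the smallest label; and nodes are visited in a fixed sorted order rather than shuffled. The function names are hypothetical.

```python
from collections import Counter, deque

def shortest_cycle_through_edge(adj, u, v):
    """Length of the shortest cycle containing edge (u, v), or inf if none."""
    # BFS from u to v while ignoring the direct edge u-v itself.
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if x == u and y == v:
                continue  # skip the direct edge
            if y not in dist:
                dist[y] = dist[x] + 1
                if y == v:
                    return dist[y] + 1  # alternative path plus the edge u-v
                queue.append(y)
    return float("inf")

def loop_length_for_label(adj, labels, v, label):
    """Assumed 'local loop' score: the shortest cycle through v and any
    neighbor that currently carries the candidate label."""
    return min(shortest_cycle_through_edge(adj, v, u)
               for u in adj[v] if labels[u] == label)

def label_propagation(adj, max_iters=100):
    """Asynchronous label propagation; ties are broken by shortest local loop."""
    labels = {v: v for v in adj}   # every node starts with a unique label
    order = sorted(adj)            # fixed order (assumption, for reproducibility)
    for _ in range(max_iters):
        changed = False
        for v in order:
            if not adj[v]:
                continue           # isolated node keeps its own label
            counts = Counter(labels[u] for u in adj[v])
            best = max(counts.values())
            tied = [lab for lab, c in counts.items() if c == best]
            # Prefer the label reachable through the shortest loop,
            # then the smallest label (assumption) instead of a random pick.
            new = min(tied, key=lambda lab: (loop_length_for_label(adj, labels, v, lab), lab))
            if new != labels[v]:
                labels[v] = new
                changed = True
        if not changed:
            break
    return labels

# Two triangles joined by the single bridge edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
print(label_propagation(adj))
```

On this toy graph, node 3 initially sees three neighbor labels with one vote each. The bridge edge 2-3 lies on no cycle, so the label arriving across it loses the tie to the labels inside node 3's own triangle, and nodes 0 to 2 end up with one label while nodes 3 to 5 end up with another.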
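[Degree-granting institution]: Tianjin University of Science and Technology
[Degree level]: Master's
[Year conferred]: 2016
[Classification number]: TP301.6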
Article ID: 1452415
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/1452415.html