基于聚类分析的社会网络社团划分方法研究
发布时间:2018-05-07 20:17
本文选题:社会网络 + 社团划分 ; 参考:《南京邮电大学》2015年硕士论文
【摘要】:近年来,随着Web2.0的发展,社会网络越来越受到更多学者们的关注和研究。在社会网络的众多性质中,社团结构是其最重要同时也是最具有研究意义的性质之一。通过社团的划分,我们不仅能够了解和分析网络的结构特点,也能够挖掘出表面数据关系之中的隐性信息。在目前的研究中已经有了很多经典的社团划分算法,比如Girvan-Newman算法,Keinighan-Lin算法等,但是准确度和时间复杂度之间的平衡仍然是社团划分算法的主要问题之一。本文对社会网络的社团划分方法进行了研究,首先系统地分析比较了现有划分算法的优缺点,然后基于聚类分析的思想,提出了两种改进的社团划分方法:基于节点相似度系数聚类的社团划分方法(Community Division Method Based on Node Similarity Clustering,NSCCDM)和基于遗传算法聚类的社团划分方法(Genetic Algorithm Clustering Based Community Division Method,GACCDM)。NSCCDM方法首先通过欧几里得公式和最短路径构造了一种新的相似度系数度量方法,并生成相似度系数矩阵,然后通过比较节点之间的相似度系数,进行节点的聚类,最后再使用模块度函数Q对被重复划分的节点进行最优社团选择。GACCDM方法主要采用了遗传算法的思想。该方法使用了模块度函数Q作为适应度函数来控制遗传算法的进化过程;采用指向性的变异策略代替随机变异策略加快算法的收敛速度;并结合网络的拓扑结构和节点之间的相似度系数关系进行遗传聚类,从而完成对社团的划分。本文使用java语言对以上的两种方法进行编码实现,并将它们分别应用于真实社会网络:Zachary空手道俱乐部网络、海豚网络以及美国足球网络。实验得出的划分结果与实际情况相符,从而表明了本文算法的准确性和可用性。
[Abstract]:In recent years, with the development of Web2.0, more and more scholars pay attention to social network. Among the many properties of social network, community structure is one of the most important and significant properties. Through the division of communities, we can not only understand and analyze the structural characteristics of the network, but also mine the hidden information in the surface data relationship. There have been many classical community partitioning algorithms, such as Girvan-Newman algorithm, Keinighan-Lin algorithm, and so on. However, the balance between accuracy and time complexity is still one of the main problems of community partitioning algorithm. In this paper, the social network community division method is studied. Firstly, the advantages and disadvantages of the existing partitioning algorithms are systematically analyzed and compared, and then based on the idea of clustering analysis, Two improved community classification methods are proposed: community Division Method Based on Node Similarity clustering method based on node similarity coefficient clustering and genetic Algorithm Clustering Based Community Division method based on genetic Algorithm Clustering Based Community Division clustering. Euclidean formula and shortest path construct a new similarity coefficient measurement method. The similarity coefficient matrix is generated, and then the clustering of nodes is carried out by comparing the similarity coefficients between nodes. In the end, the modular degree function Q is used to select the optimal community of the repeated partitioned nodes. The GACCDM method mainly adopts the idea of genetic algorithm (GA). In this method, the modular degree function Q is used as the fitness function to control the evolutionary process of genetic algorithm, and the directivity mutation strategy is used instead of the random mutation strategy to accelerate the convergence of the algorithm. Combining the topological structure of the network and the similarity coefficient relationship between nodes, genetic clustering is carried out to complete the division of communities. In this paper, we use java language to encode the above two methods, and apply them to the real social network:: Zachary karate club network, dolphin network and American football network respectively. The experimental results are consistent with the actual situation, which shows the accuracy and availability of the proposed algorithm.
【学位授予单位】:南京邮电大学
【学位级别】:硕士
【学位授予年份】:2015
【分类号】:TP393.09;TP311.13
【参考文献】
相关期刊论文 前1条
1 张慧哲;王坚;;基于初始聚类中心选取的改进FCM聚类算法[J];计算机科学;2009年06期
,本文编号:1858326
本文链接:https://www.wllwen.com/guanlilunwen/ydhl/1858326.html