Design and Implementation of a Social Network Data Acquisition and Structure Analysis System
Published: 2018-07-29 14:04
[Abstract]: The arrival of the Web 2.0 era has pushed Internet technology in a more user-centered direction, and social platforms such as Twitter, Facebook, Weibo, Pengyou, and Renren have emerged and grown rapidly. Today, most of people's everyday communication takes place on these platforms. People exchange information purposefully through them, and relationship networks form as a result. A social structure made up of people and the relationships among them is called a social network. Its two structural elements are nodes and edges: nodes generally represent people, and edges represent the relationships between them. Scientific collaboration networks, which arose to meet the needs of scientific and technological development, are the product of research cooperation and are the social networks of researchers. A co-authorship network is the part of a scientific collaboration network in which researchers are linked by the papers they have co-authored. This thesis studies two representative kinds of social network: the Weibo user relationship network, which is directed, and the scientific co-authorship network, which is undirected.
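The distinction between the two network types can be made concrete with a small sketch. The code below is only an illustration under stated assumptions, not the thesis's implementation: the function names `build_directed` and `build_undirected`, the adjacency-set representation, and the toy data are all hypothetical. A follow relation is stored as a one-way edge, while a co-authored paper links every pair of its authors in both directions.

```python
from collections import defaultdict


def build_directed(follows):
    """Build a directed graph from (follower, followee) pairs."""
    out_edges = defaultdict(set)  # user -> accounts the user follows
    in_edges = defaultdict(set)   # user -> accounts that follow the user
    for follower, followee in follows:
        out_edges[follower].add(followee)
        in_edges[followee].add(follower)
    return out_edges, in_edges


def build_undirected(papers):
    """Build an undirected co-authorship graph from per-paper author lists."""
    neighbors = defaultdict(set)
    for authors in papers:
        for a in authors:
            for b in authors:
                if a != b:
                    neighbors[a].add(b)  # the reverse edge is added when a and b swap
    return neighbors


# Toy data, invented for illustration only.
follows = [("alice", "bob"), ("carol", "bob"), ("bob", "alice")]
papers = [["Li", "Wang", "Zhao"], ["Li", "Chen"]]

out_edges, in_edges = build_directed(follows)
coauthors = build_undirected(papers)

# In a directed network, in-degree and out-degree can differ.
print("bob: followers =", len(in_edges["bob"]), "following =", len(out_edges["bob"]))
# In an undirected network, each node has a single degree.
print("Li: co-authors =", sorted(coauthors["Li"]))
```

In the directed case a node has separate in-degree (followers) and out-degree (following); in the undirected case the neighbor relation is symmetric, so one degree value suffices.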
The concept of the social network originated in sociology and has attracted wide attention from scholars in China and abroad ever since it was proposed; interest in the topic has not yet subsided. Acquiring network data is the first problem that social network research must solve. However, most existing studies rely on public data sets or simulated network data, which to some extent cannot accurately reflect the real structure of social networks. Obtaining real social network structure data from the Internet is therefore particularly important, and it makes research results more practically meaningful. The social network data acquisition and structure analysis system designed in this thesis acquires real data: Sina Weibo user relationship data from the Sina Weibo system and co-authorship data from the DBLP database.
Social network analysis and complex network analysis are two methods for analyzing social network structure that are widely accepted by scholars in China and abroad. For co-authorship networks, structural analysis plays an important role in promoting further research collaboration and in predicting the direction in which a field will develop. For the Weibo user relationship network, structural analysis offers useful guidance for market operations, user recommendation, and similar tasks. The system designed and implemented in this thesis uses role analysis, a social network analysis method, to study the structure of the co-authorship network by analyzing opinion leaders and structural holes, and uses complex network analysis to study the topological properties of the Sina Weibo user relationship network.
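The abstract does not say how the thesis identifies structural holes. One widely used measure is Burt's constraint, in which a node with low constraint has contacts that are poorly connected to one another and therefore spans structural holes. The sketch below assumes an unweighted, undirected graph without self-loops, stored as a mapping from each node to its neighbor set; the names `burt_constraint` and `from_edges` and the toy graphs are hypothetical.

```python
def from_edges(edges):
    """Build an undirected neighbor map from a list of (u, v) edges."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    return neighbors


def burt_constraint(neighbors, node):
    """Burt's constraint of `node` in an unweighted, undirected graph.

    c(i) = sum over contacts j of (p_ij + sum over q != i, j of p_iq * p_qj)^2,
    where p_uv is the share of u's ties that goes to v.
    """
    contacts = neighbors[node]
    if not contacts:
        return float("nan")  # the measure is undefined for isolated nodes

    def p(u, v):
        # With unit weights, each tie takes an equal share: 1 / degree(u).
        return 1.0 / len(neighbors[u]) if v in neighbors[u] else 0.0

    total = 0.0
    for j in contacts:
        # Only q adjacent to `node` can contribute, since p(node, q) is 0 otherwise.
        indirect = sum(p(node, q) * p(q, j) for q in contacts if q != j)
        total += (p(node, j) + indirect) ** 2
    return total


# Toy graphs, invented for illustration only.
star = from_edges([("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d")])
triangle = from_edges([("x", "y"), ("y", "z"), ("x", "z")])

# Rank nodes from least to most constrained; the first entries are the
# strongest structural-hole candidates under this measure.
ranking = sorted(star, key=lambda n: burt_constraint(star, n))
print(ranking[0], burt_constraint(star, "hub"))
print(burt_constraint(triangle, "x"))
```

In the star graph the hub bridges four otherwise unconnected contacts and gets the lowest constraint, whereas every node of the fully connected triangle is highly constrained. A top-100 list like the one the thesis describes could be taken from the head of such a ranking, though the thesis may use a different criterion.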
This thesis designs and implements a social network data acquisition and network structure analysis system. The main work is as follows:
1. It introduces the concepts and technologies involved in designing and implementing the system.
2. It designs and implements Sina Weibo data acquisition and network structure analysis, so that the system can acquire real user relationship data from the Sina Weibo system, denoise the data, generate a diagram of the relationship network, and analyze the network's topological properties using complex network analysis.
3. It designs and implements co-authorship network data acquisition and structure analysis, so that the system can acquire from the DBLP database the co-authorship data of papers accepted at academic conferences of four tiers on the topic of "data mining", process the data, generate a diagram of the co-authorship network, and detect the top 100 structural holes and opinion leaders.
4. Taking the top 100 structural holes and opinion leaders as the objects of study, it compares them on four important measures of a researcher's academic achievement: number of papers, citation count, H-index, and G-index.
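Two of the four measures in item 4 have compact standard definitions. The H-index is the largest h such that h of an author's papers have at least h citations each, and the G-index is the largest g such that the g most-cited papers together have at least g² citations. The sketch below follows those definitions; the function names and citation counts are made up for illustration, and the G-index is capped at the number of papers, which is one common convention and not necessarily the one used in the thesis.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break  # citation counts only decrease from here on
    return h


def g_index(citations):
    """Largest g such that the top g papers total at least g*g citations.

    Capped at the number of papers (an assumed convention).
    """
    ranked = sorted(citations, reverse=True)
    g = 0
    running_total = 0
    for rank, cites in enumerate(ranked, start=1):
        running_total += cites
        if running_total >= rank * rank:
            g = rank
    return g


# Hypothetical per-paper citation counts for one author.
author_citations = [10, 8, 5, 4, 3]
print("papers:", len(author_citations))
print("citations:", sum(author_citations))
print("H-index:", h_index(author_citations))
print("G-index:", g_index(author_citations))
```

Because the G-index works on cumulative citations, it gives more credit for a few highly cited papers than the H-index does, which is why the two are often reported together.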
[Degree-granting institution]: Anhui University
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TP393.09
Article ID: 2152910
Article link: https://www.wllwen.com/guanlilunwen/ydhl/2152910.html