
Research on User Influence Analysis Based on Weibo Data

Published: 2018-11-13 14:57
[Abstract]: In recent years, with the rapid development of the Internet, the web has become the main channel through which people obtain information in daily life. Weibo, an emerging online medium that has grown rapidly in recent years, has accumulated hundreds of millions of users. The platform carries a large volume of information that is updated quickly, so users are often overwhelmed by it; helping them find the posts published by highly influential users is therefore of real value. The search function offered by the Weibo platform is a good way to help users find posts. Traditional information retrieval involves three key factors: relevance, authority, and timeliness. Because Weibo content is updated quickly and written in non-standard language, timeliness and authority often matter even more. The influence analysis in this thesis is likewise a study of authority.

This thesis uses Weibo data to analyze user influence. Its main results are as follows:

1. Acquisition of Weibo data. Early in the research, a large amount of user data was crawled from the Weibo platform, including detailed user profiles, follow relationships between users, and reply and repost relationships. This data is the foundation of the work and can also serve as base data for other Weibo research.

2. The goal of the influence study is to identify a user's differing influence in different domains. The thesis assigns users to domains based on the content of the posts they publish and the follow relationships among users, and derives each user's weight in each domain. Validation against semi-automatically labeled samples shows that this assignment method is reasonably accurate.

3. Alongside the text analysis of users' posts, a parallel new-word recognition algorithm identifies new words in Weibo content, and the related-search feature of a search engine is used to semantically expand important text features. This resolves a series of problems: Weibo texts are short, features are sparse, meaningless features are too numerous, and discriminative features are few.

4. Using users' classification weights in different domains, together with the reply and repost relationships between users, the thesis builds a domain-specific influence propagation model. Comparative validation shows that the method performs well.
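The abstract does not give the formulas behind the domain-specific influence propagation model in item 4. As a rough illustration only, the sketch below swaps in a topic-sensitive, PageRank-style iteration: influence flows along reply and repost edges, and the random-restart distribution is proportional to each user's weight in one domain. The function name `domain_influence`, the `damping` and `iterations` parameters, and the toy data are all hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def domain_influence(edges, domain_weight, damping=0.85, iterations=50):
    """Hypothetical sketch: PageRank-style influence for ONE domain.

    edges: list of (source, target) pairs, meaning `source` replied to or
        reposted a post by `target`, so influence flows source -> target.
    domain_weight: dict mapping user -> non-negative weight in the domain.
    Returns a dict mapping user -> influence score (scores sum to 1).
    """
    users = set(domain_weight)
    for source, target in edges:
        users.update((source, target))
    users = sorted(users)

    # Restart distribution: proportional to domain weights (assumption),
    # falling back to uniform if every weight is zero.
    total = sum(domain_weight.get(u, 0.0) for u in users)
    if total > 0:
        restart = {u: domain_weight.get(u, 0.0) / total for u in users}
    else:
        restart = {u: 1.0 / len(users) for u in users}

    outgoing = defaultdict(list)
    for source, target in edges:
        outgoing[source].append(target)

    score = dict(restart)
    for _ in range(iterations):
        nxt = {u: (1.0 - damping) * restart[u] for u in users}
        for u in users:
            targets = outgoing.get(u)
            if targets:
                # Split this user's score evenly over the users they
                # replied to or reposted.
                share = damping * score[u] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # No outgoing interactions: hand the score back through
                # the restart distribution so total mass is preserved.
                for v in users:
                    nxt[v] += damping * score[u] * restart[v]
        score = nxt
    return score


# Toy data (invented): a and b repost c, and c replies to a.
toy_edges = [("a", "c"), ("b", "c"), ("c", "a")]
toy_weights = {"a": 0.2, "b": 0.1, "c": 0.7}
toy_scores = domain_influence(toy_edges, toy_weights)
for user in sorted(toy_scores, key=toy_scores.get, reverse=True):
    print(user, round(toy_scores[user], 3))
```

Running the same function once per domain, each time with that domain's weights, would give a per-domain ranking; whether the thesis combines the weights in this way is not stated in the abstract.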
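[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP393.092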




