Research on Adaptive Microblog Topic Tracking Based on Temporal Development

Published: 2018-11-27 08:10
[Abstract]: With the rapid development of the Internet, social networks have won favor with a growing number of people because of their interactivity, freedom, and openness. Since 2006, when Obvious, the American company of Evan Williams, launched Twitter, the world's first microblogging service site, microblogging services have flourished. Unlike traditional news and blogs, microblog posts are short, limited to 140 characters. Beyond short text, however, users can also add pictures, video, audio, and other links to their posts. This free and open mode of communication has been welcomed by users, and microblogging services have spread rapidly around the world, setting off a wave of enthusiasm for microblogging.

Because microblogs are free, interactive, and open, people can share what they see and hear, or express their feelings and attitudes, anytime and anywhere. As the number of microblog users has surged, the volume of microblog information has grown sharply, and breaking events often surface on microblog platforms. Microblog topic detection is therefore attracting researchers' attention and is becoming a research hotspot. People are sometimes more concerned with how a particular event develops, however, which makes microblog topic tracking especially important. To make full use of the time-sensitive nature of microblogs and to detect and track hot microblog topics promptly, this thesis carries out the following research:

1. A microblog topic discovery method based on growth rate, addressing the large volume and strong time sensitivity of microblog information. This thesis proposes a hot topic discovery method for microblogs based on growth rate. First, the preprocessed microblogs are divided into windows containing equal numbers of posts, the frequency of each word in each window is counted, and the counts are represented as a time sequence of 2-tuples. Next, fast-growing words are found by computing each word's growth slope between every pair of adjacent windows. Then the growth rates of the number of users and the number of posts associated with such a word are computed to decide whether it is a hot topic word. Finally, hot topics are generated by clustering the hot topic words. The results show that the method is highly capable of discovering new topics.

2. An adaptive microblog topic tracking method based on temporal development, addressing topic drift in topic tracking. To counter data sparsity in microblog tracking, the method first expands the feature words using a feature word expansion method based on relevance retrieval. Next, because fixed feature word weights tend to lower recall, it applies a time-decay-based weight adjustment strategy that decays the feature word weights appropriately. Finally, to address the static topic template, it proposes a topic template adjustment method based on double filtering, which updates the topic template with reports that are both relevant and have high importance scores. Experiments show that the method improves tracking efficiency to some extent.

3. Design and implementation of an online public opinion monitoring system built on the adaptive microblog topic tracking algorithm based on temporal development. The proposed adaptive topic tracking method is applied to topic template adjustment in the topic tracking module of the online public opinion monitoring system: microblog entries with high importance scores are used to update the topic template, giving the system higher recall and accuracy and meeting users' needs.
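The abstract describes two time-driven steps only in words: finding fast-growing words from adjacent equal-sized windows, and decaying feature word weights over time. The Python sketch below illustrates both under stated assumptions and is not the thesis's implementation. The function names, the `min_slope` threshold, the use of the window index as the time axis (so a slope is simply the frequency difference between adjacent windows), and the exponential half-life form of the decay are all hypothetical choices made for illustration. The user-count and post-count growth checks and the clustering step are omitted.

```python
from collections import Counter
from typing import Dict, List, Sequence

Post = List[str]  # one preprocessed microblog post as a list of tokens


def split_into_windows(posts: Sequence[Post], window_size: int) -> List[List[Post]]:
    """Split posts into consecutive windows holding the same number of posts.

    A trailing group smaller than window_size is dropped (an assumption).
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    return [list(posts[i:i + window_size])
            for i in range(0, len(posts) - window_size + 1, window_size)]


def word_frequency_series(windows: Sequence[Sequence[Post]]) -> Dict[str, List[int]]:
    """Return each word's frequency in every window, in window order."""
    counts = [Counter(tok for post in window for tok in post) for window in windows]
    vocab = set().union(*counts) if counts else set()
    return {word: [c[word] for c in counts] for word in vocab}


def fast_growing_words(series: Dict[str, List[int]], min_slope: float) -> Dict[str, float]:
    """Keep words whose largest adjacent-window slope reaches min_slope.

    Windows are treated as one time unit apart (an assumption), so the
    slope between neighbours is the difference of their frequencies.
    """
    result: Dict[str, float] = {}
    for word, freqs in series.items():
        slopes = [later - earlier for earlier, later in zip(freqs, freqs[1:])]
        if slopes and max(slopes) >= min_slope:
            result[word] = float(max(slopes))
    return result


def decayed_weight(weight: float, elapsed: float, half_life: float) -> float:
    """Decay a feature word weight exponentially with elapsed time.

    The abstract only says weights are decayed over time; the half-life
    form used here is an illustrative assumption.
    """
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return weight * 0.5 ** (elapsed / half_life)
```

A caller would tokenize the posts, call `split_into_windows`, pass the result to `word_frequency_series`, and filter with `fast_growing_words`. The surviving words are only candidates: in the method described above they would still need to pass the user-growth and post-growth checks before being treated as hot topic words.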
[Degree-granting institution]: Shandong Normal University
[Degree level]: Master's
[Year conferred]: 2014
[Classification number]: TP393.092; TP391.1




Article ID: 2360001


Article link: https://www.wllwen.com/guanlilunwen/ydhl/2360001.html

