Design and Implementation of a New-Topic Detection and Tracking System for Emergencies on Weibo
Published: 2018-07-16 10:07
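[Summary]: Compared with traditional news media, Weibo is short, fast, and convenient. It has quickly become a medium for conveying information on public topics and has grown into one of the main channels through which people obtain and publish information. Weibo platforms have by now accumulated large user bases and carry a huge volume of posts every day. Research on topic detection and tracking has therefore begun to shift from the traditional news domain to Weibo platforms. Because emergencies bear on national security and social stability, topic detection and tracking for emergencies of all kinds matters greatly for responding to disasters and accidents and for crisis management. This thesis focuses on Weibo platforms, uses topic detection techniques to mine the topic information related to emergencies, and on that basis models how topics propagate through the Weibo network.

The thesis combines theory with practice and proceeds through four parts in turn: a survey of the state of research in China and abroad, Weibo data collection, topic detection, and modeling of the Weibo propagation network for emergencies. First, building on existing techniques and results in data mining, complex networks, and related areas, and following the latest developments closely, it identifies points where the thesis can supplement or innovate in new-topic detection and tracking for emergencies on Weibo. Second, it collects a dataset for topic research from Sina Weibo, which involves designing a crawler for the platform and extracting, preprocessing, and storing the data in a database. Third, with the data collected, it detects emergency-related topics in Weibo posts. Because posts are short and colloquial, commonly used text clustering methods are not well suited to topic clustering of Weibo text. The thesis therefore adopts the LDA algorithm for topic detection and, to address background noise in the Weibo dataset, proposes an improved algorithm, TC-LDA. Finally, it studies the propagation network of emergency-related posts, modeling it with a combination of Petri nets and agents so that propagation characteristics can be observed both at the level of individual users and at the level of the whole system.

In summary, through research on topic detection and tracking of emergencies on Weibo, the thesis designs and implements a system covering Weibo data collection, topic detection, and propagation modeling, providing a decision-making reference for the detection, early warning, and emergency management of emergencies of all kinds.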
[Abstract]: Weibo platforms have accumulated large user bases, and the volume of information posted every day is huge. Research on topic detection and tracking has therefore begun to shift from the traditional news field to Weibo. The thesis identifies points where it can supplement or innovate in new-topic detection and tracking on Weibo. With data collection complete, it detects topics related to emergencies in Weibo posts. Because posts are short and colloquial, commonly used text clustering methods are not well suited to topic clustering of Weibo text. The thesis therefore selects the LDA algorithm for topic detection and, on that basis, proposes an improved TC-LDA algorithm to address background noise in the Weibo dataset. Finally, it studies the propagation network of emergency-related posts on Weibo. In summary, through research on topic detection and tracking on Weibo, the thesis designs and implements a system spanning data collection, topic detection, and propagation modeling, which provides a decision-making reference for the detection, early warning, and emergency management of all kinds of emergencies.
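The abstract names LDA as the base topic detection algorithm but gives no implementation details. As a rough illustration only, the sketch below runs plain LDA with collapsed Gibbs sampling on a few hand-made, pre-tokenized posts. It is not the thesis's TC-LDA, whose noise-handling step is not described here. The function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the iteration count, and the toy posts are all assumptions made for this example.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Plain LDA via collapsed Gibbs sampling (illustrative sketch, not TC-LDA).

    docs is a list of token lists. Returns (theta, top_words): theta[d][k] is
    the estimated share of topic k in document d, and top_words[k] lists up to
    three of the most frequent words assigned to topic k.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]          # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]   # count of word w in topic k
    topic_total = [0] * num_topics                        # total tokens in topic k
    assign = []                                           # topic of every token

    # Start from a random topic assignment for each token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    theta = [
        [(n + alpha) / (len(doc) + num_topics * alpha) for n in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]
    top_words = [
        [w for w, c in topic_word[k].most_common(3) if c > 0]
        for k in range(num_topics)
    ]
    return theta, top_words


# Hypothetical, already-tokenized posts; real Weibo text would first need
# Chinese word segmentation and stop-word removal.
docs = [
    ["earthquake", "rescue", "aftershock", "rescue"],
    ["earthquake", "aftershock", "casualties"],
    ["concert", "tickets", "singer", "tickets"],
    ["singer", "concert", "album"],
]
theta, top_words = lda_gibbs(docs, num_topics=2)
for k, words in enumerate(top_words):
    print("topic", k, words)
```

With short posts the per-document counts are sparse, which is one reason plain LDA tends to struggle on this kind of text and why a noise-aware variant may be needed.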
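[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree conferred]: 2014
[Classification number]: TP393.092
Article ID: 2126051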
Article link: https://www.wllwen.com/guanlilunwen/ydhl/2126051.html