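Research on Public Sentiment Analysis Techniques for Microblog Topics
Topic: microblog  Focus: public sentiment analysis  Source: PLA Information Engineering University, 2015 master's thesis  Type: degree thesis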
[Abstract]: With the rise and rapid development of Web 2.0, a large number of social media platforms, with microblogs as a leading example, have emerged on the Internet. Because posts are short, easy to publish, and quickly updated, microblogs have become an important platform for the public to obtain information and share emotions. Microblog topics spread quickly and have a large social impact. They make it convenient for the public to obtain, share, and spread information, but they also give hostile forces and lawbreakers a channel for spreading false statements and provoking negative public sentiment. Effective analysis of public sentiment on microblog topics can therefore help government departments understand public opinion and make efficient decisions, and it is important for monitoring and guiding microblog public opinion. This thesis studies public sentiment analysis techniques for microblog topics in three parts: microblog topic tracking, microblog sentiment analysis, and public sentiment analysis of microblog topics. The main results are as follows:
(1) Microblog topic tracking. Traditional methods often ignore the semantic information between features when tracking microblog topics, which leads to unsatisfactory tracking results. To address this, a word-vector-based microblog topic tracking method is proposed. First, a neural network language model is trained on a large-scale dataset to obtain word vectors that accurately represent word semantics. Then, the word vectors are used to extend the semantic information of the feature vectors, and fuzzy sets are built for the initial topic and for each microblog. Finally, the similarity between the microblog fuzzy set and the initial-topic fuzzy set is computed and compared against a preset threshold to complete topic tracking. In experiments on a microblog topic corpus, the method reaches an overall F1 of 85.71%, 5% higher than the traditional method. This indicates that the method makes full use of the semantic information introduced by the word vectors, tracks topics at the semantic level, and effectively improves microblog topic tracking performance over traditional methods.
(2) Microblog sentiment analysis. Traditional unsupervised microblog sentiment analysis methods do not handle the feature sparsity of microblog corpora well. To address this, an unsupervised microblog sentiment analysis method based on BTM (Biterm Topic Model) is proposed. First, BTM is used to model the co-occurring word pairs in the microblog corpus and mine the latent topics in the documents. Then, a merged sentiment lexicon is used to compute the sentiment distribution of each latent topic. Finally, the topic distribution of each document is combined with the sentiment distribution of the topics to compute the sentiment orientation of the microblog, completing the sentiment analysis. In experiments on the NLPCC2012 evaluation corpus, the method reaches an average F1 of 75.88%, 15% higher than the traditional method. This indicates that the BTM-based method effectively reduces the impact of feature sparsity on sentiment analysis and accurately obtains the sentiment orientation of microblogs without supervision.
(3) Public sentiment analysis of microblog topics. Existing studies either neglect public sentiment or cannot describe and analyze it accurately, and so cannot meet the needs of microblog public opinion monitoring and efficient decision-making. To address this, an effective public sentiment analysis method for microblog topics is proposed. First, positive and negative sentiment summaries of a microblog topic are extracted to describe public sentiment. Then, three proposed indicators are used to analyze public sentiment and obtain the public's sentiment orientation toward the topic. Finally, a proposed guiding-sentence generation method is used to guide public sentiment. In experiments on a microblog topic corpus, the method reaches an overall F1 of 54.95%, 11% higher than the traditional method. This indicates that the method not only improves the overall performance of microblog topic sentiment summarization, but also accurately obtains the public's sentiment orientation toward a topic and effectively guides public sentiment.
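The last two steps of method (2) can be illustrated with a minimal sketch. It assumes the BTM topic-word and document-topic distributions are already available (BTM inference itself is not shown) and that the merged lexicon simply labels words as positive or negative. The function names, the toy data, and the neutral fallback for topics without lexicon words are illustrative assumptions, not the thesis's implementation.

```python
def topic_sentiment(topic_word_probs, lexicon):
    """Estimate (P(pos | topic), P(neg | topic)) for each topic.

    topic_word_probs: list of dicts mapping word -> P(word | topic).
    lexicon: dict mapping word -> "pos" or "neg" (hypothetical merged lexicon).
    """
    dist = []
    for word_probs in topic_word_probs:
        pos = sum(p for w, p in word_probs.items() if lexicon.get(w) == "pos")
        neg = sum(p for w, p in word_probs.items() if lexicon.get(w) == "neg")
        total = pos + neg
        # Assumption: a topic with no lexicon words is treated as neutral.
        dist.append((pos / total, neg / total) if total > 0 else (0.5, 0.5))
    return dist


def document_sentiment(doc_topic_probs, topic_sent):
    """Combine P(topic | doc) with P(sentiment | topic) by summing over topics."""
    pos = sum(t * s[0] for t, s in zip(doc_topic_probs, topic_sent))
    neg = sum(t * s[1] for t, s in zip(doc_topic_probs, topic_sent))
    if pos > neg:
        label = "positive"
    elif neg > pos:
        label = "negative"
    else:
        label = "neutral"
    return label, pos, neg


# Toy inputs standing in for BTM output (made-up numbers for illustration).
topics = [
    {"happy": 0.3, "love": 0.2, "movie": 0.5},
    {"disappointed": 0.4, "awful": 0.1, "service": 0.5},
]
lexicon = {"happy": "pos", "love": "pos", "disappointed": "neg", "awful": "neg"}

topic_sent = topic_sentiment(topics, lexicon)
label, pos, neg = document_sentiment([0.8, 0.2], topic_sent)
print(label, round(pos, 2), round(neg, 2))
```

Summing over topics means a short post with few or no lexicon words can still receive a sentiment score through the topics it is assigned to, which is one plausible reading of how a topic model helps with sparse features.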
[Degree-granting institution]: PLA Information Engineering University
[Degree level]: Master's
[Year conferred]: 2015
[Classification number]: TP391.1