Research and Application of Sentiment Analysis Based on Weibo Topic Comments
Published: 2019-01-09 15:47
[Abstract]: Weibo is one of today's most popular platforms for social networking and information dissemination. In 2016, as news of the Rio Olympics and Wang Baoqiang's divorce spread, Weibo demonstrated its importance as an information dissemination platform. In September 2016 its monthly active users reached 297 million, up 34% year over year, and its average daily active users reached 132 million, up 32% year over year. People use Weibo to post messages, repost news, comment, and like posts, expressing their views on people and events and exchanging opinions with others. Analyzing the posts that Weibo users repost and comment on makes it possible to quickly gauge current trends in public opinion and public sentiment toward specific matters, which is of great reference value to decision makers. For businesses, the content that users post, repost, and comment on reveals how much they like a product or service; this is the starting point of this thesis. A sentiment analysis system based on Weibo topics can quickly and accurately summarize the public opinion environment surrounding a company or product, which has important practical value for rapid decision making, crisis public relations, and public opinion guidance. This thesis focuses on analyzing Weibo comments to determine their positive or negative sentiment polarity. The main work is as follows. First, a crawler is designed to collect a company's Weibo posts and the corresponding comments. Second, the data are preprocessed with stop-word removal, word segmentation, and related steps. Third, word vectors for the comment text are obtained with word2vec, and three classifiers are trained, based on a support vector machine (SVM), a convolutional neural network (CNN), and a long short-term memory (LSTM) network; their precision, recall, F1 score, and computation time are compared in order to select an economical and practical algorithm. Fourth, a UI is designed. To validate the algorithms, they are evaluated on the public COAE2013 dataset, and the results show that the LSTM network performs best. An optimized stacked LSTM network is then compared experimentally on COAE2013 and a Shenzhen Airlines dataset, where it performs about 1% better than the ordinary LSTM. The thesis thus experimentally compares currently popular methods for classifying short Weibo texts. In addition, to address the scarcity of Weibo-based corpora, it designs a crawler system that collects a large volume of Weibo text and, for specified accounts, gathers all comments under the relevant posts. Finally, the stacked LSTM model is selected as the comment sentiment analysis method for the Weibo topic comment sentiment analysis system, and a visual, easy-to-use sentiment analysis system is built.
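The abstract compares the classifiers by precision, recall, and F1 but does not state how these are computed. A minimal sketch follows, assuming the standard binary definitions with positive polarity as the target class. The function name and the sample labels are hypothetical and are not taken from the thesis or its datasets.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the given positive label.

    Assumes the standard definitions:
      precision = TP / (TP + FP)
      recall    = TP / (TP + FN)
      F1        = 2 * precision * recall / (precision + recall)
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels: 1 = positive comment, 0 = negative comment.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

if __name__ == "__main__":
    p, r, f = precision_recall_f1(y_true, y_pred)
    print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Running the same function on each classifier's predictions over a shared test split would give directly comparable scores; computation time would have to be measured separately.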
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year conferred]: 2017
[Classification number]: TP391.1;TP393.092
Article ID: 2405826
Article link: https://www.wllwen.com/guanlilunwen/ydhl/2405826.html