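Research on Key Technologies for Automatic Ontology Extraction in Weibo Sentiment Analysis
Published: 2018-05-10 13:29
Topic: Weibo + sentiment words; Source: Capital Normal University, master's thesis, 2014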
[Abstract]: With the rapid growth of new Internet applications, Weibo has risen quickly: its user base has reached 281 million, a usage rate of 45.5%, and tens of millions of people share their views and feelings on all kinds of topics through Weibo every day. How to automatically detect the sentiment of Weibo users, and how to assess at the macro level the opinion leanings of the Weibo community on a given topic, has become a basic scientific problem that Weibo computing and public-opinion analysis urgently need to solve.

However, previous sentiment analysis has mostly worked at the level of whole, traditional long texts. Weibo posts are short and informal, and increasingly fragmented and subjective, so traditional sentiment analysis algorithms have fundamental shortcomings: they are inefficient, and their results rarely meet practical needs. Using a sentiment lexicon to analyze the sentiment orientation of user-generated content is a simple and effective approach. But sentiment lexicons are limited in size, new Internet slang appears constantly, language use is informal, and manual compilation is time-consuming, labor-intensive and strongly domain-specific. To address these problems, this thesis proposes an algorithm that automatically mines latent sentiment words and computes their sentiment weights. The algorithm is independent of the application domain and extends well. Based on Bayesian principles and large-scale data mining, the method can discover previously unknown sentiment words and determine each word's polarity and strength of sentiment orientation from the magnitude of its sentiment weight. This effectively expands the sentiment lexicon and supports finer-grained use of it, achieving automatic mining and acquisition of the sentiment lexicon. Building on this, the thesis identifies the attributes of the sentiment subject, including opinion sentence identification, sentiment target extraction and sentiment orientation judgment, thereby completing automatic ontology extraction for sentiment analysis.

The algorithm is validated in practice on the basis of the theoretical study. To verify that the method works across domains, experiments were also run on three review corpora, from Jingdong Mall (JD.com), Douban and Dianping. Accuracy was generally above 90% in all three, which confirms the effectiveness and practicality of the algorithm and provides a foundation for sentiment analysis in Internet applications generally, not only Weibo.
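The abstract does not give the weighting formula, so the following is only a minimal sketch of how a Bayesian sentiment weight for candidate words might be computed, not the thesis's actual algorithm. It assumes that posts are already tokenized, that a small seed lexicon labels each post as positive or negative, and that the weight is the smoothed log-likelihood ratio log P(w|pos) / P(w|neg), which equals the posterior log-odds under equal class priors. The function name `mine_sentiment_weights`, its parameters and the toy data are all hypothetical.

```python
import math
from collections import Counter


def mine_sentiment_weights(posts, pos_seeds, neg_seeds, alpha=1.0, min_count=2):
    """Return {word: weight} for non-seed words.

    Assumption (not from the thesis): a post is labeled positive or negative
    by whichever seed set it contains more distinct words from. A weight
    above 0 leans positive, below 0 leans negative, and its magnitude
    indicates the strength of the orientation.
    """
    pos_counts, neg_counts = Counter(), Counter()
    for tokens in posts:
        words = set(tokens)
        pos_hits = len(words & pos_seeds)
        neg_hits = len(words & neg_seeds)
        if pos_hits == neg_hits:
            continue  # no seed evidence, or conflicting evidence: skip post
        target = pos_counts if pos_hits > neg_hits else neg_counts
        target.update(words - pos_seeds - neg_seeds)

    vocab = set(pos_counts) | set(neg_counts)
    pos_total = sum(pos_counts.values())
    neg_total = sum(neg_counts.values())

    weights = {}
    for word in vocab:
        if pos_counts[word] + neg_counts[word] < min_count:
            continue  # too rare to estimate reliably
        # Laplace-smoothed class-conditional probabilities P(word | class)
        p_word_pos = (pos_counts[word] + alpha) / (pos_total + alpha * len(vocab))
        p_word_neg = (neg_counts[word] + alpha) / (neg_total + alpha * len(vocab))
        weights[word] = math.log(p_word_pos / p_word_neg)
    return weights


# Toy corpus: "好" (good) and "差" (bad) are the seed words.
toy_posts = [
    ["手机", "好", "给力"],
    ["屏幕", "好", "给力"],
    ["手机", "差", "坑爹"],
    ["电池", "差", "坑爹"],
]
toy_weights = mine_sentiment_weights(toy_posts, {"好"}, {"差"})
for word, weight in sorted(toy_weights.items(), key=lambda kv: -kv[1]):
    print(word, round(weight, 3))
```

In this toy run the slang words "给力" and "坑爹" receive weights of opposite sign, while the neutral noun "手机", which appears equally often in both classes, receives a weight of zero. A real system would also need word segmentation and a much larger corpus, neither of which is shown here.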
[Degree-granting institution]: Capital Normal University
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TP393.092; TP391.1
Article ID: 1869500
Article link: https://www.wllwen.com/guanlilunwen/ydhl/1869500.html