
Automatic Construction and Optimization of a Sentiment Lexicon Based on Word2Vec

Published: 2018-06-08 00:25

Topics: sentiment analysis + multi-class emotion classification; Source: Computer Science (《计算机科学》), 2017, Issue 01


[Abstract]: Constructing sentiment lexicons is important foundational work in text mining. In recent years, polarity annotation in sentiment lexicons has moved from binary positive/negative labels toward multi-category emotion labels, and lexicons have become increasingly domain-specific. Manual annotation of emotion categories, however, is time-consuming and labor-intensive, emotion intensity is hard to quantify accurately, and excessive focus on domain specificity greatly limits a lexicon's applicability [1]. This paper trains a neural network language model on a large-scale Chinese corpus and, on that basis, proposes an automatic method for constructing a multi-dimensional sentiment lexicon based on a transformation constraint set. It then studies a sentiment-polarity disambiguation method based on word distribution density, which distinguishes and identifies the polarity of words that carry both positive and negative senses and computes the emotion category and intensity separately under each polarity. Finally, it proposes a global optimization scheme based on multiple semantic resources, yielding SentiRuc, a multi-dimensional Chinese sentiment lexicon with 10 emotion labels. Experiments show that the lexicon performs well in category-annotation tests, intensity-annotation tests, sentiment disambiguation, and emotion classification tasks; the intensity test in particular shows that the lexicon has very strong descriptive power for emotional semantics.
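The abstract does not spell out the transformation constraint set or the density-based disambiguation step, so the following is only a minimal sketch of the general idea behind embedding-based lexicon construction: assigning emotion categories and intensities to a word from its similarity to labeled seed words. It is not the authors' method. The toy vectors, the seed words, and the names `cosine` and `score_emotions` are all hypothetical; in practice the vectors would come from a Word2Vec model trained on a large Chinese corpus.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def score_emotions(word_vec, seed_labels, seed_vecs):
    """Return {emotion: intensity} for one word.

    Assumption for illustration: the intensity of an emotion is the highest
    non-negative cosine similarity between the word and any seed word
    carrying that emotion label.
    """
    scores = {}
    for seed, emotion in seed_labels.items():
        sim = max(0.0, cosine(word_vec, seed_vecs[seed]))
        scores[emotion] = max(scores.get(emotion, 0.0), sim)
    return scores

# Hypothetical 2-dimensional vectors standing in for trained embeddings.
TOY_VECTORS = {
    "joy_seed": [1.0, 0.0],
    "anger_seed": [0.0, 1.0],
    "candidate": [0.8, 0.6],
}
# Hypothetical seed lexicon: seed word -> emotion category.
SEED_LABELS = {"joy_seed": "joy", "anger_seed": "anger"}

scores = score_emotions(TOY_VECTORS["candidate"], SEED_LABELS, TOY_VECTORS)
best_emotion = max(scores, key=scores.get)
print(best_emotion, scores)
```

Taking the maximum over seeds keeps one atypical seed from diluting a category's score; averaging over the nearest seeds would be an equally plausible choice under the same assumptions.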
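[Affiliation]: School of Information, Renmin University of China
[Funding]: National Natural Science Foundation of China (71271209); Beijing Natural Science Foundation (4132067); Ministry of Education Humanities and Social Sciences Youth Fund (11YJC630268); Open Project of the State Key Laboratory of Digital Publishing Technology
[Classification code]: TP391.1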

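[Similar literature]

Related master's theses (1)

1 Zhu Xuemei; Weibo Recommendation Based on Word2Vec Topic Extraction [D]; Beijing Institute of Technology; 2014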



Article ID: 1993506


Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/1993506.html

