
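Research on Microblog Sentiment Classification Based on Recurrent Neural Networks

Published: 2018-03-17 01:38

  Topic: microblog text  Focus: sentiment classification  Source: Zhejiang Sci-Tech University, 2017 master's thesis  Document type: degree thesis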


[Abstract]: As a social networking platform that has grown rapidly in recent years, microblogging is easy to use, spreads information quickly, and is highly flexible, and it has been widely embraced by users. Although the content users post is highly varied, observation and analysis show that it holds a great deal of useful information. In particular, the sentiment expressed in microblog text can help governments and enterprises understand public needs, guide public opinion, identify business opportunities, and increase revenue. Sentiment classification of microblog text is therefore attracting growing attention from researchers in related fields. Learning deep semantics, representing text features effectively, and improving sentiment classification performance remain the central goals of this research area.

This thesis studies two aspects of microblog sentiment classification: subjectivity classification (subjective versus objective) and sentiment polarity classification. For subjectivity classification, it proposes a method that combines a lexicon with corpus statistics. For polarity classification, it examines feature extraction and the classification algorithm separately: for feature extraction, it proposes a fusion of shallow and deep learning features; for the classification algorithm, it proposes a sentiment classification method based on an improved recurrent neural network. The main work and contributions are as follows:

(1) For subjectivity classification of microblog text, a method combining a lexicon with corpus statistics is proposed. The reliable sentiment lexicon constructed in this thesis is first used to identify subjective texts with high reliability, and corpus statistics are then used to classify the remaining texts as subjective or objective. The resulting F1 score is 6.72% higher than that of the traditional subjectivity classification method based on a large-scale sentiment lexicon.

(2) Because conventional shallow learning features ignore the intrinsic semantics of text, a feature fusion method based on shallow and deep learning is proposed. The shallow learning features comprise three types: word, part-of-speech, and lexicon features. The deep learning features are extracted with the word2vec tool, and the two sets are then fused. Experiments show that polarity classification with the fused features outperforms classification with any single type of feature alone.

(3) For sentiment polarity classification of microblog text, an improved recurrent neural network model is adopted. The model replaces the hidden layer of a standard recurrent neural network with an LSTM structure, so that classification takes into account not only the dependencies between neighboring parts of the text sequence but also related information that lies far apart in the text. The experiments reach a classification accuracy of 85.04%, which is 3.17% higher than the traditional support vector machine method using shallow learning features.
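The role of the LSTM structure in contribution (3) can be illustrated with a minimal sketch. The code below is not the thesis's model: it is a single scalar LSTM cell using the standard gate equations, and the weights in `PARAMS`, the helper names `lstm_step` and `run_lstm`, and the toy input sequence are all hypothetical, hand-picked for illustration rather than learned. It shows how a cue at the start of a sequence can persist in the cell state across several neutral steps, which is the long-distance behavior the abstract attributes to the LSTM hidden layer.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step with scalar input, hidden state, and cell state.

    params maps each gate name to a (w_x, w_h, b) triple.
    """
    def pre(gate):
        w_x, w_h, b = params[gate]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre("i"))      # input gate: how much new content to write
    f = sigmoid(pre("f"))      # forget gate: how much old memory to keep
    o = sigmoid(pre("o"))      # output gate: how much memory to expose
    g = math.tanh(pre("g"))    # candidate content computed from the input
    c = f * c_prev + i * g     # updated cell state (long-term memory)
    h = o * math.tanh(c)       # updated hidden state
    return h, c


def run_lstm(xs, params):
    """Run the cell over a sequence, starting from zero state."""
    h, c = 0.0, 0.0
    for x in xs:
        h, c = lstm_step(x, h, c, params)
    return h, c


# Hypothetical hand-set weights (assumption, not trained values):
# the input gate opens only for a strong input, while the forget
# and output gates stay close to 1.
PARAMS = {
    "i": (10.0, 0.0, -5.0),
    "f": (0.0, 0.0, 5.0),
    "o": (0.0, 0.0, 5.0),
    "g": (5.0, 0.0, 0.0),
}

if __name__ == "__main__":
    # A sentiment cue (1.0) followed by four neutral tokens (0.0).
    h, c = run_lstm([1.0, 0.0, 0.0, 0.0, 0.0], PARAMS)
    print(f"final hidden state: {h:.3f}, final cell state: {c:.3f}")
```

In a full classifier, the final hidden state would typically feed an output layer that predicts the polarity label. The cell here works on scalars to keep the gate arithmetic visible; a practical model would use vector states and learned weight matrices.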

[Degree-granting institution]: Zhejiang Sci-Tech University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.1;TP183
