Research on Fine-Grained Sentiment Analysis of Online Review Text
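Keywords: fine-grained; sentiment ambiguity; sentiment lexicon; sentiment elements; CRFs  Source: Shandong Normal University, master's thesis, 2017  Document type: degree thesis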
[Abstract]: With the explosive growth of online review text, reviews carry a large amount of user sentiment information. Analyzing only the overall polarity of a review no longer meets users' needs, and finer-grained, attribute-level sentiment analysis is urgently needed. In addition, the casual way users express themselves leads to low word segmentation accuracy, low accuracy in sentiment element extraction, and the loss of implicit sentiment information, all of which also need to be solved. This thesis first analyzes two text preprocessing tasks, spam review filtering and Chinese word segmentation. It then extracts sentiment elements with a CRFs model and aggregates them after supplementing implicit opinion targets. Finally, it proposes an algorithm that analyzes the sentiment intensity of opposing opinions on the aggregated feature classes. The research consists of the following four parts:
(1) Text preprocessing: spam reviews are identified by classification over constructed review features, and a user dictionary is built to improve Chinese word segmentation. The thesis first classifies text based on the constructed review features, including subjective/objective classification, filters out spam reviews, and keeps genuine and valuable review text for the sentiment analysis task. The text is also divided into sense groups to facilitate later semantic and sentiment aggregation. Chinese word segmentation uses the NLPIR segmentation system, with a user dictionary built from out-of-vocabulary words such as new words, Internet vocabulary, and domain-term keywords. The dictionary corrects segmentation errors, which improves the accuracy of opinion target extraction, and also supplements the sentiment lexicon, which reduces the loss of user sentiment information.
(2) Sentiment element extraction based on a CRFs model: the joint recognition of opinion targets, sentiment words, and sentiment modifiers is cast as a structured sequence labeling task. Features are first selected to build the feature template and tag set, and the sentiment elements are then jointly recognized with a conditional random field model. Training documents are built from product feature-opinion pairs composed of explicit opinion target-sentiment word pairs and the tag sets in the review corpus, and a naive Bayes classifier is used to identify implicit opinion targets. Finally, opinion targets are aggregated by word-sense codes to alleviate feature sparsity.
(3) An algorithm for analyzing the sentiment intensity of opposing opinions based on contextual sentiment disambiguation. The thesis first defines sentiment-ambiguous words according to the dynamic polarity of sentiment words, mines collocation sets of sentiment-ambiguous words with association rules, and builds a collocation dictionary of sentiment-ambiguous words after PMI-based pruning. It then describes the Internet-word dictionary, the sentiment modifier dictionary, and the other dictionaries that were built, and proposes a method for computing the sentiment intensity of opposing opinions. Finally, a sentiment summary of opposing opinions is generated according to sentiment intensity, completing the fine-grained sentiment analysis. Experiments demonstrate the effectiveness of the dictionary construction and of the sentiment intensity calculation method.
(4) Design and implementation of a fine-grained sentiment analysis system for review text. The functional modules of the system cover the whole process of review collection, spam review filtering, Chinese word segmentation, sentiment element extraction, and fine-grained sentiment analysis, and finally present users with intuitive fine-grained results that include the intensity of opposing opinions.
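As an illustration of the PMI pruning step mentioned in part (3), the following is a minimal Python sketch, not the thesis's implementation. The abstract does not give the exact PMI formula, the threshold, or the data, so the co-occurrence pairs, the threshold of 0, and the function names `pmi_scores` and `prune_collocations` are all assumptions made for illustration. It uses the standard definition PMI(w, c) = log2(P(w, c) / (P(w) P(c))) estimated from pair counts.

```python
import math
from collections import Counter


def pmi_scores(pairs):
    """Compute base-2 PMI for each distinct (ambiguous_word, context_word) pair.

    `pairs` is a list of co-occurrence observations; probabilities are
    estimated as relative frequencies over that list (an assumption).
    """
    total = len(pairs)
    pair_counts = Counter(pairs)
    word_counts = Counter(w for w, _ in pairs)
    context_counts = Counter(c for _, c in pairs)
    scores = {}
    for (w, c), n in pair_counts.items():
        p_wc = n / total
        p_w = word_counts[w] / total
        p_c = context_counts[c] / total
        scores[(w, c)] = math.log2(p_wc / (p_w * p_c))
    return scores


def prune_collocations(scores, threshold=0.0):
    """Keep only collocations whose PMI exceeds the (hypothetical) threshold."""
    return {pair: s for pair, s in scores.items() if s > threshold}


# Toy, made-up candidate collocations of sentiment-ambiguous words.
observations = (
    [("long", "battery life")] * 4
    + [("long", "boot time")] * 1
    + [("short", "boot time")] * 4
    + [("short", "battery life")] * 1
)

scores = pmi_scores(observations)
kept = prune_collocations(scores)
print(sorted(kept))  # [('long', 'battery life'), ('short', 'boot time')]
```

In this toy data the two frequent collocations have PMI log2(1.6), roughly 0.68, and survive, while the two rare ones have PMI log2(0.4), roughly -1.32, and are pruned. A real collocation dictionary would also need the polarity of each surviving pair, which this sketch does not model.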
[Degree-granting institution]: Shandong Normal University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.1