Chinese Opinion Target Extraction and Sentiment Polarity Analysis Based on Dependency Parsing
Published: 2019-01-28 08:26
[Abstract]: With the growth of the Internet, text containing opinions and reviews has appeared in large quantities. People read the reviews that others post and, at the same time, keep sharing their own opinions and feelings about particular people or things. Sentiment analysis can mine collective opinions from review text on the Internet, which provides very important guidance for economic development, political decision-making, and individual behavior. Sentiment analysis is either coarse-grained or fine-grained. Coarse-grained sentiment analysis currently achieves good results, whereas the results of fine-grained sentiment analysis remain unsatisfactory. Opinion target extraction combined with sentiment polarity analysis is an important subtask of fine-grained sentiment analysis, and opinion target extraction is the bottleneck that limits the performance of this subtask. There are four main approaches to opinion target extraction: finding frequently occurring nouns and noun phrases, exploiting the relations between opinion words and opinion targets, supervised learning, and topic models. Many current methods that exploit the relations between opinion words and opinion targets have difficulty extracting the target that an opinion word actually refers to, especially when the target and the opinion word are not in the same clause. To address this problem, this thesis builds on the dependency relations between words in Chinese review sentences and improves sentiment analysis performance through semantic role labeling, additional extraction rules, and a search algorithm.
The main work of the thesis is as follows:
(1) On the basis of existing lexicons, sentiment lexicons for sentiment analysis are constructed, including a positive emotion lexicon, a negative emotion lexicon, a positive evaluation lexicon, a negative evaluation lexicon, an opinion-quotation lexicon, a subjunctive-mood lexicon, an adversative (transition word) lexicon, and a nominal sentiment lexicon. These lexicons are mainly used to reduce the interference with sentiment analysis caused by useless components within evaluative sentences and by non-evaluative sentences that merely express thoughts or wishes, and to supply the vocabulary needed by the semantic rules and by polarity analysis.
(2) On the basis of dependency parsing, and with the help of semantic role labeling, a series of extraction rules is added for sentiment analysis. In addition, attributive-head phrases (phrases made up of an attributive modifier and its head word) replace ordinary noun phrases when extracting candidate opinion targets, in order to improve the precision with which opinion targets and opinion words are extracted. The rules mainly take into account the influence of Chinese semantic knowledge and common sentence patterns on sentiment analysis. Experimental results show that, on the NLPCC 2013 microblog (Weibo) evaluation corpus, the dependency-parsing-based method with the added semantic rules significantly improves opinion target extraction performance.
(3) An opinion target search method is proposed. It improves the precision of finding the real opinion target in the context when only a pronoun is extracted or when the syntactic relations contain no opinion target. The method mainly combines word sense with a word similarity algorithm to narrow the search range for potential opinion targets in the context. Experimental results show that the method improves the precision of opinion target extraction on the experimental corpus.
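As a rough illustration of the kind of rule described in contributions (2) and (3), the sketch below walks a hand-written dependency parse, links an opinion word to its subject, and expands that subject into an attributive-head phrase. Everything in it is an assumption made for illustration: the abstract does not name a parser, so the relation labels (`SBV`, `ATT`, `ADV`, `RAD`, `HED`) and POS tags follow the style of the LTP toolkit; the lexicon entries, the `Token` class, and the function names are hypothetical; semantic role labeling is omitted; and the pronoun fallback simply takes the most recent context noun, which stands in for, and is much simpler than, the similarity-based search the thesis describes.

```python
from dataclasses import dataclass

# Toy polarity lexicon (hypothetical entries, not the thesis's dictionaries).
OPINION_LEXICON = {"清晰": 1, "漂亮": 1, "模糊": -1, "差": -1}
NEGATORS = {"不", "没", "没有"}


@dataclass
class Token:
    idx: int   # 1-based position in the sentence
    word: str
    pos: str   # coarse POS tag: n = noun, a = adjective, r = pronoun, ...
    head: int  # index of the head token, 0 for the root
    rel: str   # dependency label (LTP-style, assumed): SBV, ATT, ADV, RAD, HED


def children(sent, head_idx, rels):
    """Return the dependents of head_idx whose label is in rels."""
    return [t for t in sent if t.head == head_idx and t.rel in rels]


def phrase(sent, head):
    """Join head with its ATT modifiers (and their particles) in surface order."""
    picked, stack = {head.idx}, [head]
    while stack:
        node = stack.pop()
        # Particles such as "的" are only followed below the head itself.
        rels = {"ATT"} if node is head else {"ATT", "RAD"}
        for child in children(sent, node.idx, rels):
            picked.add(child.idx)
            stack.append(child)
    return "".join(t.word for t in sent if t.idx in picked)


def extract(sent, context_nouns=()):
    """Return (target, opinion word, polarity) triples for one parsed sentence."""
    results = []
    for tok in sent:
        if tok.word not in OPINION_LEXICON:
            continue
        polarity = OPINION_LEXICON[tok.word]
        # A negating adverb attached to the opinion word flips its polarity.
        if any(c.word in NEGATORS for c in children(sent, tok.idx, {"ADV"})):
            polarity = -polarity
        subjects = children(sent, tok.idx, {"SBV"})
        if subjects and subjects[0].pos != "r":
            target = phrase(sent, subjects[0])
        elif context_nouns:
            # Simplified fallback: most recent noun seen in earlier sentences.
            target = context_nouns[-1]
        else:
            target = None
        results.append((target, tok.word, polarity))
    return results


# "手机的屏幕很清晰" (The phone's screen is very clear.)
SENT_POSSESSIVE = [
    Token(1, "手机", "n", 3, "ATT"),
    Token(2, "的", "u", 1, "RAD"),
    Token(3, "屏幕", "n", 5, "SBV"),
    Token(4, "很", "d", 5, "ADV"),
    Token(5, "清晰", "a", 0, "HED"),
]

# "它不清晰" (It is not clear.) The subject is a pronoun.
SENT_PRONOUN = [
    Token(1, "它", "r", 3, "SBV"),
    Token(2, "不", "d", 3, "ADV"),
    Token(3, "清晰", "a", 0, "HED"),
]

print(extract(SENT_POSSESSIVE))
print(extract(SENT_PRONOUN, context_nouns=["屏幕"]))
```

In the first sentence the target comes out as the full attributive-head phrase 手机的屏幕 rather than the bare noun 屏幕, which is the motivation the abstract gives for preferring such phrases over ordinary noun phrases. In the second, the pronoun 它 is replaced by a noun from the context and the negator 不 flips the polarity of 清晰.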
[Degree-granting institution]: Southeast University
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP391.1
Article ID: 2416882
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/2416882.html