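Research on Event-Driven Text Emotion Cause Discovery
Published: 2018-04-23 10:44
Topics: text emotion cause discovery + event-driven; Source: Harbin Institute of Technology, master's thesis, 2017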
[Abstract]: With the rapid development of Internet technology, the vast amount of text in cyberspace holds both valuable knowledge and potential risks. Against this background, research based on natural language processing, such as public opinion monitoring, opinion extraction, and emotion analysis, is becoming increasingly important. The research focus is now moving from the increasingly mature task of text emotion analysis toward mining the causes of the emotions expressed in text, that is, from "knowing what" to "knowing why". This is the task studied in this thesis: text emotion cause discovery. Research on emotion cause discovery depends on the algorithms used and is also constrained by the availability of corpora annotated with causes; the current lack of such corpora has held back progress in the field. This thesis therefore first designs and builds a moderately sized emotion cause annotated corpus, and on that basis studies methods for text emotion cause discovery driven by emotion cause events. The work consists of three parts. First, to address the lack of annotated corpora, an emotion cause corpus based on news text is designed and built. Based on observation and analysis of how emotion causes are expressed, a complete and comprehensive annotation scheme is designed. Following this scheme, 2,105 instances containing emotion causes were manually selected from 15,687 news documents and annotated with their emotion causes, yielding an emotion cause annotated corpus. Second, using this corpus, the thesis studies an event-driven method for text emotion cause discovery. Based on analysis and observation of how emotion causes are expressed in text, a method is designed that abstracts the external stimuli that trigger the emergence and change of emotions into an event tuple structure. A candidate emotion cause event extraction algorithm based on dependency parsing and an emotion cause event recognition algorithm based on a polynomial-kernel support vector machine are then designed and implemented. Experiments on the corpus built in this thesis show that the method improves the F-measure of emotion cause recognition by 3.34% over the baseline method. Third, to address the limited expressive power of the event tuple structure, emotion cause event tuples are further converted into an event tree structure, giving an effective mapping of emotion causes from text to event trees. By combining a tree kernel with a polynomial kernel, a more effective emotion cause discovery method is designed and implemented. Experimental results show that, compared with the baseline system, this method improves the F-measure by 10.61%. The event-driven emotion cause discovery method proposed in this thesis abstracts and maps emotion cause text well, and achieves the best performance among currently known methods in the emotion cause discovery experiments. In addition, the Chinese emotion cause annotated corpus built in this thesis, released as an open research resource, can help advance research in this field.
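The idea of combining a tree kernel with a polynomial kernel can be illustrated with a minimal Python sketch. The abstract does not say how the event trees are built or how the two kernels are combined, so everything below is an assumption made for illustration: the `Node` class, the simplified tree kernel that only counts shared parent-to-children productions (not necessarily the tree kernel used in the thesis), the weighted sum controlled by `alpha`, the polynomial parameters `degree` and `coef0`, and the toy trees and feature vectors.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class Node:
    """A node of a hypothetical event tree: a label plus ordered children."""
    label: str
    children: List["Node"] = field(default_factory=list)


def productions(tree: Node) -> Counter:
    """Count every (parent label, child labels) production in the tree."""
    counts: Counter = Counter()
    stack = [tree]
    while stack:
        node = stack.pop()
        if node.children:
            key = (node.label, tuple(c.label for c in node.children))
            counts[key] += 1
            stack.extend(node.children)
    return counts


def tree_kernel(a: Node, b: Node) -> float:
    """Simplified tree kernel: dot product of production counts."""
    pa, pb = productions(a), productions(b)
    return float(sum(n * pb[p] for p, n in pa.items()))


def poly_kernel(x: Sequence[float], y: Sequence[float],
                degree: int = 2, coef0: float = 1.0) -> float:
    """Polynomial kernel (x . y + coef0) ** degree on flat feature vectors."""
    return (sum(xi * yi for xi, yi in zip(x, y)) + coef0) ** degree


def composite_kernel(tree_a: Node, x_a: Sequence[float],
                     tree_b: Node, x_b: Sequence[float],
                     alpha: float = 0.5) -> float:
    """Assumed combination: weighted sum of the tree and polynomial kernels."""
    return (alpha * tree_kernel(tree_a, tree_b)
            + (1 - alpha) * poly_kernel(x_a, x_b))


# Toy event trees (hypothetical shape: EVENT -> ACTOR, ACTION, OBJECT).
t1 = Node("EVENT", [Node("ACTOR", [Node("he")]),
                    Node("ACTION", [Node("lost")]),
                    Node("OBJECT", [Node("job")])])
t2 = Node("EVENT", [Node("ACTOR", [Node("she")]),
                    Node("ACTION", [Node("lost")]),
                    Node("OBJECT", [Node("wallet")])])

# The trees share two productions: the EVENT expansion and ACTION -> lost.
print(tree_kernel(t1, t2))                               # 2.0
print(composite_kernel(t1, [1.0, 0.0], t2, [1.0, 1.0]))  # 3.0
```

A weighted sum of two valid kernels with non-negative weights is itself a valid kernel, so a Gram matrix computed with `composite_kernel` could in principle be passed to an SVM implementation that accepts precomputed kernels.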
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP391.1