
Fine-Grained Sentiment Analysis of Product Reviews

Published: 2018-06-02 05:28

  Topics: fine-grained sentiment analysis + long short-term memory network; Source: master's thesis, Harbin Institute of Technology (哈尔滨工业大学), 2017


[Abstract]: With the continued growth of e-commerce platforms, online shopping has become the way a growing number of internet users shop. At the same time, a large volume of product reviews has appeared online. These reviews serve as a reference for shoppers and give manufacturers a convenient way to learn about user feedback. However, reviews at this scale cannot all be read manually, so neither users nor manufacturers can get a complete picture of public opinion on a product. Thanks to growing computing power and the arrival of the big-data era, natural language processing, an important research and application area of artificial intelligence, has come to play an irreplaceable role in many fields. Computers can process large-scale data at low cost and high efficiency; if they could automatically help users and manufacturers analyze the attitudes expressed in product reviews, a great deal of labor and resources would be saved.

Text sentiment analysis is one of the most important research directions in natural language processing. Sentiment analysis algorithms can automatically determine a user's attitude from a document or sentence, such as support, opposition, or neutrality, and can even identify the user's emotion, such as joy, sadness, or surprise. However, document-level and sentence-level sentiment analysis usually cannot identify the object toward which the attitude is expressed. When analyzing product reviews, we are interested not only in the user's attitude but also in which aspect of the product the user approves of or is dissatisfied with. Fine-grained sentiment analysis addresses this problem well. Its most important step is to find the opinion targets and opinion words in user reviews and to determine how they pair with each other.

This thesis first investigates a sequence labeling method based on recurrent neural networks. The method treats the extraction of opinion targets and opinion words as a sequence labeling task: by assigning a tag to every word in a sentence, it determines which words are opinion targets and which are opinion words. In addition, the pairing relations between opinion targets and opinion words must be determined. The thesis uses two relation classification methods: one based on kernel functions over syntactic and semantic information, and one based on a neural network that incorporates syntactic relations. Both methods draw on the syntactic relations between words to decide whether two words form a correct pairing, and the opinion targets and opinion words are then extracted accordingly. Experimental results show that all three algorithms are very effective on their respective tasks. On word extraction, the sequence labeling method based on recurrent neural networks outperforms a rule-based word extraction algorithm. On pairing-relation extraction, models that incorporate syntactic structure information improve markedly, which demonstrates the value of syntactic structure for relation classification. In addition, the hybrid model based on a convolutional neural network and a recursive neural network models sentence semantics better and performs even better.
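The two-stage pipeline described in the abstract can be illustrated with a small sketch. It assumes a BIO tag scheme with the hypothetical labels `TARGET` (opinion target, or "evaluation object") and `OPINION` (opinion word, or "evaluation word"); the abstract does not state which tag set the thesis actually uses. The tags below are written by hand in place of a trained recurrent tagger, and the function names `decode_bio` and `candidate_pairs` are illustrative only. The sketch shows how per-word tags become extracted spans, and how those spans form the candidate pairs that a relation classifier would then accept or reject.

```python
from typing import List, Tuple

Span = Tuple[str, int, int]  # (label, start index, end index exclusive)


def decode_bio(tags: List[str]) -> List[Span]:
    """Collect labeled spans from a BIO tag sequence (assumed scheme)."""
    spans: List[Span] = []
    label, start = None, 0
    # A trailing "O" sentinel flushes a span that runs to the end.
    for i, tag in enumerate(tags + ["O"]):
        if tag.startswith("I-") and label == tag[2:]:
            continue  # the current span keeps going
        if label is not None:
            spans.append((label, start, i))
            label = None
        if tag.startswith("B-") or tag.startswith("I-"):
            # A stray I- tag with no matching B- is treated as a span start.
            label, start = tag[2:], i
    return spans


def candidate_pairs(tokens: List[str], spans: List[Span]) -> List[Tuple[str, str]]:
    """Pair every target span with every opinion span in the sentence.

    A relation classifier (kernel-based or neural in the thesis) would
    decide which of these candidates are genuine pairings; this sketch
    only enumerates them.
    """
    targets = [" ".join(tokens[s:e]) for lab, s, e in spans if lab == "TARGET"]
    opinions = [" ".join(tokens[s:e]) for lab, s, e in spans if lab == "OPINION"]
    return [(t, o) for t in targets for o in opinions]


# Hand-written example; a trained tagger would predict these tags.
tokens = ["The", "battery", "life", "is", "great", "but",
          "the", "screen", "is", "dim"]
tags = ["O", "B-TARGET", "I-TARGET", "O", "B-OPINION", "O",
        "O", "B-TARGET", "O", "B-OPINION"]

spans = decode_bio(tags)
pairs = candidate_pairs(tokens, spans)
print(spans)
print(pairs)
```

Enumerating all target and opinion combinations keeps extraction and pairing separate: the example sentence yields four candidates, of which only two ("battery life" with "great", "screen" with "dim") are correct, and that is the decision left to the relation classification stage.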
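[Degree-granting institution]: Harbin Institute of Technology (哈尔滨工业大学)
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.1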



