Research on Key Technologies of Sentiment Analysis for Restaurant Reviews

Published: 2018-06-18 03:02

Topic: recurrent neural networks + LSTM; source: Harbin Institute of Technology, 2017 master's thesis

[Abstract]: With the growth of the Internet and e-commerce, convenient applications such as online shopping and online food ordering have become part of everyday life, and the volume of reviews posted on these platforms is growing exponentially. These reviews are vast in number and of great research value. Analyzing them to determine consumers' sentiment polarity toward each opinion target can guide purchasing decisions and also help merchants understand consumer needs and improve their products. This thesis studies two sentiment analysis subtasks in the restaurant review domain, opinion target extraction and target polarity classification, and applies the best-performing methods to a restaurant review sentiment analysis system. The work is organized as follows.

First, opinion target extraction is studied. An output-dependent bidirectional LSTM model is proposed. Building on the LSTM model, it uses two independent hidden layers to process the text in both directions, making full use of the features contained in the preceding and following context. Self-connections are added between the output layers to exploit the dependencies within the output sequence, and part-of-speech, syntactic, sentiment orientation, and named entity recognition features are added to improve performance. A conditional random field (CRF) method is also implemented, with gains coming mainly from feature selection and combination. In addition, a BLSTM-CRF extraction method is implemented, in which the BLSTM output vectors are fed directly into a CRF model to obtain the best output label sequence.

Second, target polarity classification is studied. A polarity classification model based on bidirectional LSTM is proposed. It uses two BLSTM networks, BLSTML and BLSTMR, to collect the semantic information of the target's preceding and following context, respectively. At each time step, the current word vector is concatenated with the target vector and fed into the model, so that the model can capture the semantic relationship between each word and the target. The model achieved the best results among comparable models. In addition, a boosting-based model fusion method is proposed that fuses a support vector machine (SVM) model and a random forest model: after each classifier is trained, the weights of the samples it misclassified are increased and the weights of the samples it classified correctly are decreased, and the models' outputs are finally weighted according to each model's performance to produce the final result. This method combines the strengths of linear and nonlinear classifiers.

Finally, a sentiment analysis system for restaurant reviews is designed and implemented. The output-dependent bidirectional LSTM extraction method and the bidirectional LSTM polarity classification method are applied in the system, improving the accuracy of both target extraction and polarity classification. The system presents the opinion targets and the proportion of each polarity as pie charts.
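
The boosting-based fusion step described in the abstract can be sketched roughly as below. This is a minimal illustration, not the thesis's implementation: it assumes AdaBoost-style weight updates (the abstract does not give the exact formula), and the `Stump` class is a hypothetical toy classifier standing in for the SVM and random forest models. The names `boosted_fusion` and `fused_predict` are likewise illustrative.

```python
import math


class Stump:
    """Hypothetical toy classifier standing in for an SVM or random forest.

    Predicts +1 when feature `index` exceeds `threshold`, otherwise -1.
    """

    def __init__(self, index, threshold):
        self.index = index
        self.threshold = threshold

    def fit(self, X, y, sample_weight):
        # A real model would train on the weighted samples here;
        # this toy has a fixed rule and nothing to learn.
        return self

    def predict(self, X):
        return [1 if x[self.index] > self.threshold else -1 for x in X]


def boosted_fusion(models, X, y):
    """Train models in sequence, reweighting samples after each one.

    Assumes AdaBoost-style updates. Returns a list of (model, alpha)
    pairs, where alpha is the model's vote weight.
    """
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []
    for model in models:
        model.fit(X, y, sample_weight=weights)
        preds = model.predict(X)
        # Weighted error: total weight of the misclassified samples.
        err = sum(w for w, p, t in zip(weights, preds, y) if p != t)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((model, alpha))
        # When err < 0.5 (alpha > 0), misclassified samples gain weight
        # and correctly classified samples lose weight.
        weights = [w * math.exp(-alpha * t * p)
                   for w, p, t in zip(weights, preds, y)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def fused_predict(ensemble, X):
    """Combine the models' votes, weighted by each model's alpha."""
    scores = [0.0] * len(X)
    for model, alpha in ensemble:
        for i, p in enumerate(model.predict(X)):
            scores[i] += alpha * p
    return [1 if s >= 0 else -1 for s in scores]


# Toy data: two binary features, labels in {-1, +1}.
X = [[1, 0], [1, 1], [0, 0], [0, 0], [0, 1]]
y = [1, 1, -1, -1, 1]

ensemble = boosted_fusion([Stump(0, 0.5), Stump(1, 0.5)], X, y)
fused = fused_predict(ensemble, X)
```

In this toy run the second model receives the larger vote weight, because the one sample it gets wrong carries little weight after the first round of reweighting. With real classifiers, `fit` would need to accept per-sample weights for the reweighting to have any effect.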
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.1
