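Building a Sentiment Topic Model Based on Nonparametric Bayesian Methods
Published: 2018-07-06 10:52
Topic: sentiment analysis + fine-grained; Source: Southwest University of Science and Technology, 2016 master's thesis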
[Abstract]: With the rise of Weibo, blogs, and e-commerce websites in recent years, user participation and activity have grown steadily, producing massive volumes of comments on best-selling products, trending news events, and similar topics. Mining these texts reveals users' evaluations of products and their views on social events, which is valuable for merchants' product development, users' purchase decisions, and governments' public-opinion monitoring and policy making. Analyzing and processing this text has therefore become urgent, and text sentiment analysis is one of the main tasks involved. This thesis studies fine-grained sentiment analysis and, drawing on nonparametric Bayesian methods, proposes a product-attribute-oriented user sentiment model. The main research content is as follows. First, a study of traditional sentiment models applied to user sentiment in product reviews identifies two main problems: they lack fine-grained sentiment analysis at the level of product attributes, and the number of automatically extracted product attributes must be fixed in advance. Next, a fine-grained, product-attribute-oriented user sentiment model is proposed: a hierarchical Dirichlet process (HDP) first clusters noun entities into product attributes and determines their number automatically; then, using the weights of the noun entities within each product attribute, the evaluation phrases, and a sentiment lexicon as priors, latent Dirichlet allocation (LDA) classifies the sentiment of each product attribute. Finally, mobile-phone review data were collected from Taobao and Jingdong (JD.com), and the reviews of Apple phones were selected as the experimental dataset. The experimental results show that the model classifies sentiment accurately, with an average accuracy of 87%. Compared with traditional sentiment models, it is more accurate both in extracting product attributes and in classifying the sentiment of evaluation phrases.
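The abstract's point that the number of product attributes need not be fixed in advance can be illustrated with the Chinese restaurant process (CRP), the clustering prior that underlies the hierarchical Dirichlet process. The sketch below is not the thesis's implementation: the function name `crp_assign`, the parameter `alpha`, and the item counts are assumptions chosen for illustration. It only draws cluster assignments from the prior; real HDP inference would also weigh how well each noun entity's words fit each cluster.

```python
import random
from collections import Counter


def crp_assign(n_items, alpha, seed=0):
    """Seat n_items customers by the Chinese restaurant process.

    Returns a list of table (cluster) indices, one per customer.
    `alpha` is the concentration parameter (assumed > 0): larger
    values open new tables more often. Hypothetical helper, for
    illustration only.
    """
    rng = random.Random(seed)
    table_sizes = []  # table_sizes[k] = number of customers at table k
    assignments = []
    for _ in range(n_items):
        # An existing table k is chosen with weight table_sizes[k];
        # a brand-new table is chosen with weight alpha.
        weights = table_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(table_sizes):
            table_sizes.append(1)  # open a new table
        else:
            table_sizes[k] += 1    # join an existing table
        assignments.append(k)
    return assignments


if __name__ == "__main__":
    # The number of clusters is an outcome of the process, not an input.
    for alpha in (0.5, 5.0):
        z = crp_assign(200, alpha, seed=42)
        print(alpha, len(set(z)), Counter(z).most_common(3))
```

In this sketch the number of occupied tables plays the role of the number of product attributes: it emerges from the data size and `alpha` instead of being passed in, in contrast to a fixed topic count in plain LDA.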
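[Degree-granting institution]: Southwest University of Science and Technology
[Degree level]: Master's
[Year degree granted]: 2016
[Classification number]: TP391.1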
Article ID: 2102546
Article link: https://www.wllwen.com/jingjilunwen/dianzishangwulunwen/2102546.html