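Research on Genetic-Algorithm-Based Analysis of Fake Cross-Domain Product Reviews
Published: 2018-06-04 06:43
Topic: fake reviews + cross-domain; Reference: Yunnan University, 2016 master's thesis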
[Abstract]: As e-commerce has matured, online shopping has become the choice of many consumers. Because product reviews influence purchasing decisions, some sellers deliberately fabricate fake reviews to boost their own sales or to undermine competitors. Fake review analysis has therefore become an important topic in text sentiment analysis. Current approaches, however, have high complexity and low recognition accuracy, and the task is difficult when labeled data is lacking or scarce. This thesis therefore studies cross-domain fake review analysis using transfer learning, a genetic algorithm, and graph-based techniques.
First, for cross-domain fake product reviews, the thesis uses a genetic algorithm to select an optimal feature set from known fake reviews in the source domain. Reviews are first converted into numerical form according to the features characteristic of fake reviews. The structured review data is then encoded as chromosome genes, and a fitness function built on logistic regression, together with the genetic algorithm, selects the optimal feature set. Selecting this feature set helps reduce the complexity of fake review analysis. Finally, experiments analyze how genuine and fake reviews differ on these features.
Second, based on the optimal feature set, the thesis proposes a cross-domain fake review identification method based on transfer learning. The method defines the association between the known and unknown domains from the similarity of their documents, then combines graph-based techniques to train a sentiment classifier that identifies fake reviews in the unknown domain. Experimental results indicate that the method is feasible and has certain advantages in identifying fake reviews.
Third, based on the proposed method, a prototype system for identifying fake reviews is designed and implemented, providing a platform and a foundation for further research on fake review identification.
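The feature-selection step described in the abstract can be illustrated with a small sketch. This is not the thesis's implementation: the abstract does not give the chromosome encoding, the fitness formula, or the review features, so everything below is an assumption. Here a chromosome is a bit mask over feature columns, fitness is the validation accuracy of a logistic regression trained on the selected columns minus a small per-feature penalty, and the data is synthetic. All function names (`train_logreg`, `fitness`, `select_features`, `make_toy_data`) are hypothetical.

```python
import math
import random

def train_logreg(X, y, epochs=100, lr=0.5):
    """Fit logistic regression by batch gradient descent; bias is the last weight."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            z = w[-1] + sum(wj * xj for wj, xj in zip(w, xi))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() in range
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w

def accuracy(w, X, y):
    """Fraction of rows whose predicted class (z > 0 means fake) matches the label."""
    hits = 0
    for xi, yi in zip(X, y):
        z = w[-1] + sum(wj * xj for wj, xj in zip(w, xi))
        hits += int((z > 0) == (yi == 1))
    return hits / len(X)

def fitness(mask, train, valid, penalty=0.01):
    """Assumed fitness: validation accuracy minus a penalty per selected feature."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature set cannot classify anything
    def project(X):
        return [[row[j] for j in cols] for row in X]
    (X_tr, y_tr), (X_va, y_va) = train, valid
    w = train_logreg(project(X_tr), y_tr)
    return accuracy(w, project(X_va), y_va) - penalty * len(cols)

def select_features(train, valid, n_feat, pop_size=12, generations=15,
                    p_mut=0.1, seed=0):
    """Bit-mask GA: tournament selection, one-point crossover, bit-flip mutation."""
    rng = random.Random(seed)
    cache = {}  # masks repeat often, so memoize their fitness

    def score(mask):
        key = tuple(mask)
        if key not in cache:
            cache[key] = fitness(mask, train, valid)
        return cache[key]

    def tournament(pop):
        a, b = rng.sample(pop, 2)
        return a if score(a) >= score(b) else b

    pop = [[rng.randint(0, 1) for _ in range(n_feat)] for _ in range(pop_size)]
    best = max(pop, key=score)
    for _ in range(generations):
        nxt = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(nxt) < pop_size:
            p1, p2 = tournament(pop), tournament(pop)
            cut = rng.randint(1, n_feat - 1)
            child = p1[:cut] + p2[cut:]
            nxt.append([1 - g if rng.random() < p_mut else g for g in child])
        pop = nxt
        best = max(pop, key=score)
    return best, score(best)

def make_toy_data(n, n_feat, rng):
    """Synthetic stand-in for numeric review features; only columns 0 and 1 carry signal."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)  # 1 = fake, 0 = genuine
        row = [rng.gauss(0.0, 1.0) for _ in range(n_feat)]
        row[0] += 2.0 * label
        row[1] -= 2.0 * label
        X.append(row)
        y.append(label)
    return X, y

data_rng = random.Random(42)
train_set = make_toy_data(60, 6, data_rng)
valid_set = make_toy_data(40, 6, data_rng)
best_mask, best_score = select_features(train_set, valid_set, n_feat=6)
print("selected feature mask:", best_mask, "fitness:", round(best_score, 3))
```

The per-feature penalty is one simple way to favor smaller feature sets, in line with the abstract's stated goal of reducing analysis complexity; a real study would replace the toy data with features extracted from labeled source-domain reviews and would likely use cross-validation inside the fitness function.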
[Degree-granting institution]: Yunnan University
[Degree level]: Master's
[Year degree granted]: 2016
[Classification number]: TP391.1; TP18
Article ID: 1976388
Article link: https://www.wllwen.com/jingjilunwen/dianzishangwulunwen/1976388.html