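Research on User Opinion Mining Methods in the Network Environment
Published: 2018-03-21 17:53
Topic: Weibo  Focus: product reviews  Source: Minzu University of China, 2016 master's thesis  Type: degree thesis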
[Abstract]: In the network environment, the volume of information and data is growing rapidly, and user opinion information plays a very important role within it. Mining user opinions on social networks can be used to survey public opinion, and mining user opinions on e-commerce websites can give merchants a valuable reference for product design and promotion. User opinion mining must be built on big data; user opinion information contains users' latent interests and also reflects their emotional state. This thesis studies user opinion mining methods based on Chinese Weibo product reviews, which is very important for discovering potential consumer groups: merchants can use the results to formulate more targeted product marketing strategies and so realize the value of big data mining. The main research contents are as follows: 1. A detailed discussion of user opinion analysis techniques for text, mainly covering document-level, sentence-level, and word-level user opinion classification. 2. A study of the advantages and disadvantages of various classification algorithms, with an example-based comparison of the random forest algorithm and the support vector machine algorithm in terms of generalization ability, noise robustness, and imbalanced classification. 3. Construction of a Weibo user opinion analysis model based on syntactic dependency relations and a Weibo user opinion analysis model based on text classification, followed by experimental analysis. 4. System testing, mainly covering the setup of the test environment, testing on the Hadoop platform, and analysis of the test results. The thesis proposes a novel Chinese Weibo user opinion analysis algorithm that combines syntactic dependency relations with text classification. Test results show that the algorithm's precision and recall can approach 90%, a considerable improvement over the algorithm before the improvement. The implemented user opinion analysis system is reliable, flexible, and easy to extend; it can rapidly filter massive volumes of Weibo data and can be extended to various social networks and e-commerce websites.
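The abstract reports precision and recall near 90% without defining them. The following is a minimal sketch assuming the standard per-class definitions (precision = TP / (TP + FP), recall = TP / (TP + FN)). The function name, the label strings, and the ten example labels are hypothetical illustrations, not data or code from the thesis.

```python
def precision_recall(gold, predicted, positive="positive"):
    """Return (precision, recall) for one class label.

    gold and predicted are equal-length sequences of labels.
    This is an illustrative helper, not the thesis's evaluation code.
    """
    pairs = list(zip(gold, predicted))
    # True positives: labeled positive and predicted positive.
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    # False positives: predicted positive but labeled otherwise.
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    # False negatives: labeled positive but predicted otherwise.
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels for ten Weibo posts (made up for illustration).
gold = ["positive"] * 5 + ["negative"] * 5
predicted = ["positive", "positive", "positive", "positive", "negative",
             "positive", "negative", "negative", "negative", "negative"]

p, r = precision_recall(gold, predicted)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this made-up example one positive post is missed and one negative post is wrongly flagged, so both measures come out at 0.80.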
[Degree-granting institution]: Minzu University of China
[Degree level]: Master's
[Year degree granted]: 2016
[Classification number]: TP391.1
Article No.: 1644925
Article link: https://www.wllwen.com/jingjilunwen/dianzishangwulunwen/1644925.html