Research and Implementation of a Weibo-Based Sentiment Orientation Analysis System

Published: 2018-05-27 04:36

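  Topics: sentiment classification + sentiment orientation; Source: Beijing University of Posts and Telecommunications, master's thesis, 2016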


[Abstract]: In recent years the Internet has developed rapidly, and social networking sites have become the main platform on which people express their opinions. Weibo, one of the most popular of these sites, generates a large volume of user behavior data every day, and this data has research value in many fields. Sentiment orientation analysis is currently an active research area: it applies statistical and machine learning methods to analyze and mine user behavior data, and uses the results to predict users' sentiment. This thesis studies and implements a sentiment analysis system for Weibo text. The work covers six aspects. First, it studies commonly used sentiment analysis algorithms, namely support vector machines, Naive Bayes, AdaBoost, and neural networks, examining the principles of the four algorithms and comparing them. Second, it studies the page layout of the Weibo platform and designs a distributed Weibo crawler, which mainly collects hot-topic data, including post text and comments. Third, it designs a data preprocessing system and defines three preprocessing rules: an emoticon conversion rule, a deduplication rule, and an invalid-data cleaning rule. Fourth, it analyzes the characteristics of Weibo text and selects feature extraction methods to suit them, mainly using the chi-square test and TF-IDF to extract and represent text features. Fifth, it builds Weibo text classifiers with the first three of the algorithms above (support vector machines, Naive Bayes, and AdaBoost), assigns each post to one of three classes (positive, negative, or neutral), and compares and analyzes the classification results of the three algorithms. Sixth, it designs and implements a presentation system that retrieves topic data and displays it through a web interface. Finally, the sentiment analysis system is tested on Weibo topic data, and the results indicate that it performs well at predicting Weibo sentiment.
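The preprocessing rules and the two feature methods named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the emoticon table `EMOTICON_WORDS`, the function names, and the choice to treat URLs and @-mentions as invalid data are all assumptions made for illustration. The sketch also assumes posts have already been split into tokens, since Chinese text would need a word segmenter that is not shown here.

```python
import math
import re
from collections import Counter

# Hypothetical emoticon table (assumption): bracketed emoticon tags in the
# post text are mapped to plain sentiment words.
EMOTICON_WORDS = {"[哈哈]": "happy", "[泪]": "sad"}


def preprocess(posts):
    """Apply three illustrative rules: emoticon conversion,
    invalid-data cleaning, and deduplication."""
    seen, cleaned = set(), []
    for text in posts:
        for tag, word in EMOTICON_WORDS.items():
            text = text.replace(tag, f" {word} ")
        text = re.sub(r"https?://\S+", " ", text)  # assumed invalid: URLs
        text = re.sub(r"@\S+", " ", text)          # assumed invalid: mentions
        text = " ".join(text.split())              # normalize whitespace
        if not text or text in seen:               # drop empty posts and duplicates
            continue
        seen.add(text)
        cleaned.append(text)
    return cleaned


def chi_square(docs, labels, term, cls):
    """Chi-square statistic for a (term, class) pair from document counts."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        if label == cls:
            a += has_term        # in class, contains term
            c += not has_term    # in class, lacks term
        else:
            b += has_term        # other class, contains term
            d += not has_term    # other class, lacks term
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0


def tfidf(docs, vocab):
    """TF-IDF vectors over a fixed vocabulary (relative tf, idf = ln(N/df))."""
    n = len(docs)
    df = Counter(t for tokens in docs for t in set(tokens))
    vectors = []
    for tokens in docs:
        tf = Counter(tokens)
        size = max(len(tokens), 1)
        vectors.append(
            [tf[t] / size * math.log(n / df[t]) if df[t] else 0.0 for t in vocab]
        )
    return vectors
```

In a pipeline of this shape, the terms with the highest chi-square scores would form the vocabulary, and the resulting TF-IDF vectors would be passed to a classifier such as a support vector machine, Naive Bayes, or AdaBoost.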
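[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification numbers]: TP391.1; TP393.092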



