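Research on Public Opinion Trend Prediction for Civil Aviation Events Based on Text Sentiment Orientation Analysis
Published: 2018-03-14 06:21
Topic: online public opinion  Focus: spam comment identification  Source: Civil Aviation University of China, 2017 master's thesis  Type: degree thesis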
[Abstract]: With the rapid development of China's civil aviation industry, public attention to the sector keeps growing. New media such as Weibo and online forums bring civil aviation public opinion events under intense scrutiny. Netizens use these platforms to comment on civil aviation events, but some of those comments are spam: off-topic or even false. Spam comments therefore need to be handled before a civil aviation event is analyzed. In addition, the sentiment orientation of current comments influences the attitude that later netizens take toward the same event, so analyzing comment sentiment accurately and objectively and forecasting its trend is important for assessing how public opinion on a civil aviation event will develop and for responding in advance.

For identifying and filtering spam comments, this thesis defines six indicators as features, including whether a comment appears repeatedly and the number of times government departments are mentioned in it. Feature weights are computed with the information gain algorithm, and a support vector machine optimized by particle swarm optimization (PSO-SVM) identifies and filters the spam comments.

Because obtaining a prediction indicator is the prerequisite for predicting the sentiment trend of online public opinion, the thesis proposes as its indicator a time series of comment sentiment orientation values, in contrast to the purely popularity-based indicators used previously (for example, attention level, number of comment replies, and number of reposts). Because the sentiment orientation values are nonlinear and random, a relevance vector machine (RVM) model is used for trend prediction to improve accuracy.

Experiments were designed to analyze and verify these results. For spam comment identification and filtering, the experiments examine how the number of features and the choice of individual features affect identification; the results show the importance of selecting suitable features. For sentiment trend prediction, the predictions of the RVM model, the Elman neural network, and the BP neural network are compared, with accuracy evaluated by mean absolute error (MAE) and root mean square error (RMSE). The comparison shows that the RVM outperforms the other two models and reflects netizens' sentiment trend toward public opinion events more accurately. The thesis concludes that its study of spam comment identification and sentiment trend prediction in civil aviation public opinion analysis is therefore meaningful.
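The abstract evaluates prediction accuracy with MAE and RMSE. The sketch below shows one way these two metrics could be computed for a sentiment orientation time series. The function names, the sample values, and the labels `forecast_a` and `forecast_b` are illustrative assumptions; they are not data, code, or results from the thesis.

```python
import math
from typing import Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error: the average of |actual - predicted|."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error: the square root of the mean squared difference."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical sentiment orientation values per time step (made up for illustration).
actual = [0.2, -0.1, 0.4, 0.0]
forecast_a = [0.1, -0.1, 0.5, 0.1]   # output of some hypothetical model A
forecast_b = [0.4, -0.3, 0.1, 0.2]   # output of some hypothetical model B

for name, forecast in (("model A", forecast_a), ("model B", forecast_b)):
    print(f"{name}: MAE={mae(actual, forecast):.3f}, RMSE={rmse(actual, forecast):.3f}")
```

Lower values of either metric indicate a closer fit. RMSE squares each error before averaging, so it penalizes large individual errors more heavily than MAE does, which is one reason to report both.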
[Degree-granting institution]: Civil Aviation University of China
[Degree level]: Master's
[Year conferred]: 2017
[Classification number]: TP391.1
Article ID: 1610019
Article link: https://www.wllwen.com/shoufeilunwen/xixikjs/1610019.html