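Development and Application of a Public Transit Public Opinion Analysis System Based on Sentiment Analysis
Published: 2018-04-29 18:14
Topics: sentiment analysis + polarity classification; Source: Zhejiang University, 2017 master's thesis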
[Abstract]: Since Premier Li Keqiang of the State Council introduced the concept of "Internet Plus" in the Government Work Report, traditional industries have been seeking breakthroughs and innovation by combining their core businesses with the Internet, so as to realize the marginal effect of value re-creation for the enterprise. The public transit industry, itself a traditional industry, likewise hopes to use the Internet, big data platforms, and similar tools to intelligently collect and organize traffic-flow and public-opinion information and so address some of its current problems. Sentiment analysis based on artificial intelligence therefore plays an important role. Research on sentiment analysis is already relatively mature, but in real industry applications existing techniques mostly rely on supervised methods, which achieve high accuracy but have low portability and high labor costs. Unsupervised sentiment analysis is somewhat less accurate, but it compensates for these shortcomings: it reduces modeling inaccuracy when little data is available, and it can be applied quickly to new domains with good results. This thesis therefore studies and improves existing unsupervised sentiment analysis techniques in light of the characteristics of public opinion about public transit. The specific contributions are as follows:
1. A Word2Vec-based method for expanding the sentiment lexicon is proposed, which combines the domain specificity and semantic information of words to cover as many frequently used sentiment words as possible.
2. A new text representation model suited to long texts, the RPFLO model, is established. It captures the correspondence between sentiment words and their evaluation targets and reveals the sentence order and the semantic relations hidden between sentences in long texts.
3. An event-subject extraction method based on the RPFLO model is proposed. The method uses common substrings to extract the initial cluster centers automatically, thereby improving the K-means clustering algorithm.
4. A similar-topic clustering method based on an improved CURE algorithm is proposed. Outliers are preprocessed first, and an unreachable class is then introduced so that the clustering process terminates automatically, which traditional clustering algorithms cannot do, improving the algorithm's efficiency.
Through these improvements, the thesis optimizes the unsupervised sentiment analysis approach, substantially improving analysis accuracy while preserving high portability and low labor costs.
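The abstract does not describe how the Word2Vec-based lexicon expansion works in detail, so the following is only a minimal sketch of one common approach, not the thesis's method: a candidate word inherits the polarity of its most similar seed word when their embedding cosine similarity reaches a threshold. The function name `expand_lexicon`, the threshold of 0.8, and the three-dimensional toy vectors are all illustrative assumptions. In practice the vectors would come from a Word2Vec model trained on a public transit corpus, and the domain-specificity weighting mentioned in the abstract is not modeled here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_lexicon(vectors, seeds, threshold=0.8):
    """Return a copy of `seeds` extended with similar words.

    vectors:   dict mapping word -> embedding (hypothetical toy data here)
    seeds:     dict mapping seed word -> polarity (+1 or -1)
    threshold: assumed minimum cosine similarity for adopting a word
    """
    lexicon = dict(seeds)
    for word, vec in vectors.items():
        if word in seeds:
            continue
        best_seed, best_sim = None, threshold
        for seed in seeds:
            if seed not in vectors:
                continue  # a seed without an embedding cannot be compared
            sim = cosine(vec, vectors[seed])
            if sim >= best_sim:
                best_seed, best_sim = seed, sim
        if best_seed is not None:
            # The candidate inherits the polarity of its closest seed.
            lexicon[word] = seeds[best_seed]
    return lexicon


# Toy embeddings standing in for trained Word2Vec vectors (assumed values).
vectors = {
    "punctual": (0.90, 0.10, 0.00),
    "on-time": (0.85, 0.15, 0.05),
    "crowded": (0.10, 0.90, 0.00),
    "packed": (0.12, 0.88, 0.10),
    "route": (0.00, 0.10, 0.95),
}
seeds = {"punctual": 1, "crowded": -1}

expanded = expand_lexicon(vectors, seeds)
print(expanded)
```

With these toy vectors, "on-time" and "packed" sit close to a seed and are added with that seed's polarity, while "route" is far from both seeds and stays out of the lexicon. Taking only the single nearest seed keeps the sketch short; averaging over several seeds per polarity would be a reasonable alternative.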
[Degree-granting institution]: Zhejiang University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.1
Article ID: 1821030
Article link: https://www.wllwen.com/shoufeilunwen/xixikjs/1821030.html