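Dictionary-Based Mining of Sentiment and Attitude in Financial Weibo Posts
Published: 2018-04-05 08:39
Topic: Weibo; Focus: sentiment classification; Source: Zhejiang Normal University, master's thesis, 2014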
[Abstract]: In recent years, with the rapid development of China's economy, China's stock market has also grown rapidly. It now has 2,467 listed companies, the combined market capitalization of the Shanghai and Shenzhen exchanges is 23.5 trillion, the number of investors has reached 160 million, and China's stock market has become the third largest in the world by market capitalization. For investors, financial news on the Internet bears directly on their interests.

Weibo, a new type of social tool, is popular for its short posts, easy publishing, and real-time interaction. It has become the second-largest online social medium in China and the second-largest source of public opinion. Analyzing finance-related Weibo posts, with attention to the public's reaction to financial markets (that is, its sentiment), can provide a reference for market forecasting and serve financial practitioners and investors. Taking finance as a case study for analyzing Weibo public opinion therefore has practical significance and application value.

In studying sentiment in financial Weibo posts, the thesis builds a complete classification model covering normalization, classification, named entity recognition, sentiment analysis, and trend prediction. Its focus, however, is sentiment analysis. Sentiment orientation classification, also known as opinion mining or sentiment polarity classification, determines whether the attitude a user expresses toward an object is supportive, opposed, or neutral, commonly called positive, negative, or neutral sentiment. The main work consists of the following parts:

(1) The thesis studies the grammatical composition, semantic features, and formation rules of full and abbreviated company and organization names and, combining these with sentiment words specific to finance, builds a financial domain dictionary using the Semantic Orientation Pointwise Mutual Information (SO-PMI) algorithm.
(2) It analyzes the characteristics of Chinese Weibo posts and, drawing on features of Internet language and financial language, builds an Internet slang dictionary together with dictionaries of negation words, degree adverbs, and emoticons, which support deeper study of sentiment mining.
(3) It proposes a weighted sentiment calculation method that applies the constructed dictionaries to sentiment classification, yielding a quantitative sentiment classification value.

Finally, financial Weibo posts containing company names were collected over a period of time through the Sina API. After preprocessing, word segmentation, and feature selection, they were classified with the dictionary-based sentiment classification method. The experiments confirm the importance of the financial domain dictionary, the Internet slang dictionary, and the emoticon dictionary. Comparing the results obtained with all dictionaries in place against actual stock market movements shows that the results are meaningful in practice and, with further research, could be applied to stock investment.
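The abstract names two techniques without giving formulas: SO-PMI for building the domain dictionary, and a weighted calculation that combines sentiment words with negation words and degree adverbs. The following is a minimal Python sketch of how such a pipeline might look, not the thesis's actual method. Everything specific here is an assumption made for illustration: document-level co-occurrence counts, additive smoothing, the seed words, the English placeholder tokens, the modifier weights, and the function names `so_pmi`, `score_post`, and `classify`.

```python
import math


def so_pmi(word, docs, pos_seeds, neg_seeds, smoothing=1.0):
    """Semantic orientation of `word` from document-level co-occurrence.

    SO-PMI(w) = sum_p PMI(w, p) - sum_n PMI(w, n)
    A positive value suggests positive polarity, a negative value the opposite.
    """
    doc_sets = [set(doc) for doc in docs]
    total = len(doc_sets) + smoothing

    def prob(*terms):
        # Smoothed fraction of documents containing all the given terms
        # (assumed smoothing scheme; keeps the logarithm defined at zero counts).
        hits = sum(1 for d in doc_sets if all(t in d for t in terms))
        return (hits + smoothing) / total

    def pmi(a, b):
        return math.log2(prob(a, b) / (prob(a) * prob(b)))

    return (sum(pmi(word, p) for p in pos_seeds)
            - sum(pmi(word, n) for n in neg_seeds))


def score_post(tokens, sentiment, negators, intensifiers):
    """Weighted sum of sentiment-word polarities in one segmented post.

    Negators flip the sign and degree adverbs scale the weight of the next
    sentiment word; modifiers accumulate until that word is reached.
    """
    total = 0.0
    weight = 1.0
    for tok in tokens:
        if tok in negators:
            weight *= -1.0
        elif tok in intensifiers:
            weight *= intensifiers[tok]
        elif tok in sentiment:
            total += weight * sentiment[tok]
            weight = 1.0  # reset after the modifiers have been consumed
    return total


def classify(score, threshold=0.0):
    """Map a numeric score to positive, negative, or neutral."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"
```

In use, `so_pmi` would be run over candidate words from a segmented corpus to assign polarities for the domain dictionary, and `score_post` would then be applied to each post with the Internet slang and emoticon entries merged into the `sentiment` mapping. Letting modifiers carry across neutral tokens is a simplification; a windowed scope would be a reasonable alternative.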
[Degree-granting institution]: Zhejiang Normal University
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TP391.1;TP393.092
Article ID: 1714007
Article link: https://www.wllwen.com/guanlilunwen/ydhl/1714007.html