Research on Sentiment Classification of Chinese Weibo
Published: 2019-06-22 19:08
[Abstract]: With the growth of the Internet, microblogging (Weibo) has become the social tool most frequently used by Internet users among the mainstream social platforms of the Web 2.0 era. It is popular because posts are short, publishing is easy, and users can interact in real time. Users increasingly turn to Weibo to share content and to express their opinions, views, and emotions. Weibo's growing influence has attracted many researchers, and sentiment classification of microblog posts is an important research direction in this area. Sentiment classification of English microblogs has been studied fairly thoroughly, whereas research on Chinese microblogs is still at an early stage. As the number of Chinese Weibo users keeps growing and Weibo begins to affect every aspect of life in China, research on sentiment classification of Chinese microblogs is both important and urgent. Sentiment classification analyzes and mines the subjective, sentiment-bearing content of a text in order to determine the sentiment category to which the text belongs. This thesis analyzes the characteristics of Chinese microblogs and studies their sentiment classification, building on existing theories and methods of traditional text sentiment classification. For subjective/objective classification of Chinese microblog text, the thesis proposes a subjective-sentence extraction method that combines a lexicon with a corpus. It first uses a high-confidence sentiment lexicon to extract sentiment expressions from a sentence, which keeps precision high; it then extracts the remaining sentiment expressions by corpus-based learning with the sentence 2-POSW model, which improves recall.
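The two-stage extraction described above might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the abstract does not define the 2-POSW model, so the sketch assumes it refers to bigrams over word/part-of-speech units. The lexicon entries, the POS tags, the toy corpus, and the thresholds `min_count` and `min_ratio` are all hypothetical, and sentences are assumed to be already segmented and tagged.

```python
from collections import Counter

# Hypothetical high-confidence sentiment lexicon (illustrative entries only).
LEXICON = {"喜欢", "讨厌", "开心"}

def bigrams(tagged):
    """Return adjacent pairs of word/POS units from a tagged sentence."""
    units = [f"{word}/{pos}" for word, pos in tagged]
    return list(zip(units, units[1:]))

def learn_patterns(corpus, min_count=2, min_ratio=0.8):
    """Keep bigrams that occur mostly in subjective training sentences.

    corpus: iterable of (tagged_sentence, label), label True = subjective.
    The thresholds are assumptions, not values from the thesis.
    """
    subjective, total = Counter(), Counter()
    for tagged, label in corpus:
        for bg in set(bigrams(tagged)):
            total[bg] += 1
            if label:
                subjective[bg] += 1
    return {bg for bg, n in total.items()
            if n >= min_count and subjective[bg] / n >= min_ratio}

def is_subjective(tagged, patterns):
    """Stage 1: lexicon lookup (precision). Stage 2: learned bigrams (recall)."""
    if any(word in LEXICON for word, _ in tagged):
        return True
    return any(bg in patterns for bg in bigrams(tagged))

# Toy training data with made-up POS tags.
corpus = [
    ([("太", "d"), ("棒", "a"), ("了", "y")], True),
    ([("真", "d"), ("是", "v"), ("太", "d"), ("棒", "a")], True),
    ([("今天", "t"), ("开会", "v")], False),
]
patterns = learn_patterns(corpus)
```

Checking the lexicon first means a sentence containing a trusted sentiment word is accepted without consulting the learned patterns; the patterns only add sentences the lexicon would have missed.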
For sentiment polarity classification of Chinese microblog text, the thesis first extracts the subjective sentences from the microblogs and then builds a Chinese microblog sentiment polarity corpus by drawing on two label sources: Weibo emoticons and the high-confidence sentiment lexicon. This keeps the corpus large while ensuring annotation quality and reducing the manual annotation burden. On this corpus, sentiment-word and sentiment-phrase features are extracted and refined using frequency and information entropy, then combined with sentiment-lexicon features and punctuation features in polarity classification experiments. The results show that, for subjective/objective classification of Chinese microblogs, the F-measure of the proposed method is 7% higher than that of the traditional approach based on a large-scale sentiment lexicon. For sentiment polarity classification, the proposed optimized method reaches an F-measure of 81.9918%, a good experimental result.
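One way to read "refined using frequency and information entropy" is as a filter that keeps features which occur often enough and whose occurrences are concentrated in one polarity class. The sketch below illustrates that reading only; the abstract does not give the actual criterion, so the thresholds `min_freq` and `max_entropy`, the function names, and the toy documents are assumptions. The `f_measure` helper is the standard F-measure, included because the abstract reports results in that metric.

```python
import math
from collections import Counter, defaultdict

def class_entropy(counts):
    """Shannon entropy (bits) of a feature's distribution over polarity classes."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values() if c)

def select_features(docs, min_freq=2, max_entropy=0.9):
    """Keep features that are frequent and skewed toward one class.

    docs: iterable of (features, label). Thresholds are hypothetical.
    """
    per_feature = defaultdict(Counter)
    for features, label in docs:
        for feature in set(features):
            per_feature[feature][label] += 1
    return {feature for feature, counts in per_feature.items()
            if sum(counts.values()) >= min_freq
            and class_entropy(counts) <= max_entropy}

def f_measure(precision, recall, beta=1.0):
    """Standard F-measure; beta=1 gives the harmonic mean of P and R."""
    denom = beta ** 2 * precision + recall
    return (1 + beta ** 2) * precision * recall / denom if denom else 0.0

# Toy labelled documents (made-up features and labels).
docs = [
    (["开心", "微博"], "pos"),
    (["开心", "哈哈"], "pos"),
    (["难过", "微博"], "neg"),
    (["难过"], "neg"),
]
selected = select_features(docs)
```

In this toy data a feature seen equally in both classes has entropy 1 bit and is dropped, while a feature seen only once is dropped by the frequency threshold regardless of its entropy.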
[Degree-granting institution]: East China Normal University
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TP391.1; TP393.092
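[Similar literature]
Related master's theses (top 3)
1 Zhu Haihuan (朱海欢); Research on Sentiment Classification of Chinese Weibo [D]; East China Normal University; 2014
2 Lin Jianghao (林江豪); Research on Key Technologies of Chinese Weibo Sentiment Analysis [D]; Guangdong University of Foreign Studies; 2013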
3 彭蔚U,
Article ID: 2504882
Article link: https://www.wllwen.com/guanlilunwen/ydhl/2504882.html