A Cross-Domain Sentiment Classification Method Based on an Improved EM Algorithm
Published: 2018-07-17 17:30
[Abstract]: Supervised learning is currently the main approach to text sentiment classification, and it usually requires the training set and the test set to share the same data distribution. In practice, however, the labeled data and the test data often come from different domains, and this difference in distribution lowers classification accuracy. To address this problem, a cross-domain sentiment classification method based on the EM algorithm is proposed. First, a sentiment-orientation reference table is generated from multiple source domains combined with the target domain. Second, an improved EM algorithm consults this table to iteratively adjust the classification results of the target-domain classifier until the results match the reference table. Finally, the method is compared experimentally on public datasets with mainstream classifiers such as naive Bayes and SVM. The experimental results show that the method improves the accuracy of cross-domain sentiment classification to some extent.
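The abstract names the two stages of the method but not their details, so the following Python sketch is only an illustration under stated assumptions, not the authors' algorithm. It assumes documents are token lists, labels are 1 (positive) or 0 (negative), the reference table keeps target-domain words whose polarity agrees across every source domain that contains them, the classifier is a multinomial naive Bayes model, and the "adjustment" is a simple blend of the classifier posterior with the table's vote. All function names and the `mix` parameter are hypothetical.

```python
import math
from collections import Counter

def build_reference_table(source_domains, target_docs):
    """Assumed construction: keep target-domain words whose polarity sign
    is the same in every source domain where the word occurs."""
    target_vocab = {w for doc in target_docs for w in doc}
    table = {}
    for word in target_vocab:
        signs = []
        for docs, labels in source_domains:
            pos = sum(1 for d, y in zip(docs, labels) if y == 1 and word in d)
            neg = sum(1 for d, y in zip(docs, labels) if y == 0 and word in d)
            if pos + neg:
                signs.append(1 if pos > neg else -1 if neg > pos else 0)
        if signs and signs[0] != 0 and all(s == signs[0] for s in signs):
            table[word] = signs[0]
    return table

def reference_vote(doc, table):
    """P(positive) suggested by the table: sigmoid of the summed word signs."""
    score = sum(table.get(w, 0) for w in doc)
    return 1.0 / (1.0 + math.exp(-score))

def fit_nb(docs, weights, vocab, alpha=1.0):
    """M-step: weighted multinomial naive Bayes; weights[i] = P(positive | doc i)."""
    counts = [Counter(), Counter()]  # index 0 = negative, 1 = positive
    totals = [0.0, 0.0]
    for doc, p in zip(docs, weights):
        for c, w_c in ((0, 1.0 - p), (1, p)):
            totals[c] += w_c
            for word in doc:
                counts[c][word] += w_c
    n = sum(totals)
    log_prior = [math.log((totals[c] + alpha) / (n + 2 * alpha)) for c in (0, 1)]
    log_like = []
    for c in (0, 1):
        denom = sum(counts[c].values()) + alpha * len(vocab)
        log_like.append({w: math.log((counts[c][w] + alpha) / denom) for w in vocab})
    return log_prior, log_like

def posterior(doc, model):
    """E-step for one document: P(positive | doc) under the current model."""
    log_prior, log_like = model
    s = [log_prior[c] + sum(log_like[c][w] for w in doc if w in log_like[c])
         for c in (0, 1)]
    m = max(s)
    e = [math.exp(x - m) for x in s]
    return e[1] / (e[0] + e[1])

def em_adapt(source_domains, target_docs, max_iter=20, mix=0.5):
    """Iterate E/M steps until target labels agree with the reference table.
    max_iter must be at least 1; mix weights the table's vote (an assumption)."""
    table = build_reference_table(source_domains, target_docs)
    src_docs = [d for docs, _ in source_domains for d in docs]
    src_w = [float(y) for _, labels in source_domains for y in labels]
    vocab = {w for d in src_docs + target_docs for w in d}
    model = fit_nb(src_docs, src_w, vocab)  # initialise from labeled source data
    ref = [reference_vote(d, table) for d in target_docs]
    for _ in range(max_iter):
        # E-step: classifier posterior nudged toward the reference table
        post = [(1 - mix) * posterior(d, model) + mix * r
                for d, r in zip(target_docs, ref)]
        # Stop once hard labels match the table wherever the table has an opinion
        if all((p >= 0.5) == (r >= 0.5) for p, r in zip(post, ref) if r != 0.5):
            break
        # M-step: refit on source labels plus soft target labels
        model = fit_nb(src_docs + target_docs, src_w + post, vocab)
    return [int(p >= 0.5) for p in post], table
```

Calling `em_adapt([(docs_a, labels_a), (docs_b, labels_b)], target_docs)` returns the predicted target labels together with the reference table. Blending with a fixed `mix` is the simplest way to let the table steer the posterior; a real system would likely tune or replace this step.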
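[Author affiliation]: National Digital Switching System Engineering and Technological Research Center;
[Funding]: National Science and Technology Support Program of China (2014BAH30B01); National Natural Science Foundation of China, Innovative Research Groups project (61521003); National Natural Science Foundation of China (61379151)
[CLC number]: TP391.1
Article ID: 2130411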
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/2130411.html