A Hierarchical Multi-Label Classification Method Based on Variable Minimum Bayesian Risk
Published: 2018-08-06 20:32
[Abstract]: Hierarchical multi-label classification organizes labels into a hierarchy according to the correlations among them and uses that hierarchy as supervisory information to better solve multi-label classification problems. Two kinds of methods are commonly used for this task: loss-agnostic methods and loss-sensitive methods. For loss-sensitive methods, a commonly used loss function is HMC-loss, which assigns different weights to false-positive and false-negative errors and incorporates hierarchical information into the loss. When HMC-loss is used for prediction, the resulting loss value is ideal, but the number of predicted labels is far larger than the number of true labels. In addition, introducing hierarchical information adversely affects the order in which label nodes are decided. To address these problems, the paper first proposes an improved loss function, IMH-loss, and then, using Bayesian decision theory, proposes a hierarchical multi-label classification method in which the Bayesian risk varies over the course of the decision process. Experiments on real-world datasets show that the method improves label prediction precision while maintaining recall.
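As a rough illustration of the kind of loss the abstract describes (different weights for false negatives and false positives, with hierarchy information folded in), the following Python sketch may help. It is not the paper's HMC-loss or IMH-loss definition, which the abstract does not give. The cost scheme (the root has cost 1 and each child receives its parent's cost divided by the number of siblings), the function names `node_costs` and `weighted_hier_loss`, and the toy hierarchy are all assumptions made for this example.

```python
from typing import Dict, List, Optional, Set


def node_costs(parent: Dict[str, Optional[str]]) -> Dict[str, float]:
    """Assign a cost to every node of a label tree.

    Assumed scheme (not taken from the source): a root has cost 1.0 and
    each child gets its parent's cost split equally among the siblings.
    """
    children: Dict[str, List[str]] = {}
    for node, par in parent.items():
        if par is not None:
            children.setdefault(par, []).append(node)

    roots = [node for node, par in parent.items() if par is None]
    costs: Dict[str, float] = {root: 1.0 for root in roots}
    stack = list(roots)
    while stack:
        node = stack.pop()
        kids = children.get(node, [])
        for kid in kids:
            costs[kid] = costs[node] / len(kids)
            stack.append(kid)
    return costs


def weighted_hier_loss(
    y_true: Set[str],
    y_pred: Set[str],
    costs: Dict[str, float],
    alpha: float = 1.0,
    beta: float = 1.0,
) -> float:
    """Cost-weighted loss: alpha scales false negatives, beta false positives."""
    false_neg = sum(costs[n] for n in y_true - y_pred)
    false_pos = sum(costs[n] for n in y_pred - y_true)
    return alpha * false_neg + beta * false_pos


# Hypothetical toy hierarchy: root -> {a, b}, a -> {a1, a2}.
toy_parent = {"root": None, "a": "root", "b": "root", "a1": "a", "a2": "a"}
toy_costs = node_costs(toy_parent)
toy_loss = weighted_hier_loss(
    y_true={"root", "a", "a1"},
    y_pred={"root", "a", "a2", "b"},
    costs=toy_costs,
    alpha=2.0,
    beta=1.0,
)
```

In this toy case the prediction misses `a1` and wrongly adds `a2` and `b`. Raising `beta` relative to `alpha` penalizes over-prediction more heavily, which is the kind of trade-off the abstract points to when it notes that HMC-loss predicts far more labels than are true.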
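[Affiliation]: School of Computer and Information Technology, Shanxi University; Key Laboratory of Computational Intelligence and Chinese Information Processing of the Ministry of Education, Shanxi University
[Funding]: National Natural Science Foundation of China (61632011, 61272095, 61432011, U1435212, 61573231, 61672331)
[Classification number]: TP301.6
Article ID: 2168908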
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/2168908.html