Label Enhancement for Label Distribution Learning
Published: 2018-10-12 08:41
[Abstract]: Multi-label learning (MLL) deals with the case where one instance is associated with multiple labels; its goal is to learn a mapping from an instance to its set of relevant labels. Existing MLL methods generally adopt a uniform label distribution assumption, that is, every relevant (positive) label is treated as equally important to the instance. In many real-world learning problems, however, the relevant labels differ in importance. Label distribution learning therefore describes the importance of the different labels with a label distribution, and it has achieved good results. Many datasets, however, contain only simple logical labels rather than label distributions. To address this problem, the logical labels can be converted into label distributions by mining the information about differences in label importance that is implicit in the training samples, and prediction accuracy can then be effectively improved through label distribution learning. This process of lifting the original logical labels to label distributions is defined as label enhancement for label distribution learning. The paper proposes the concept of label enhancement for the first time, gives its formal definition, summarizes the existing algorithms that can be used for label enhancement, and compares them experimentally. The experimental results show that label enhancement can uncover the differences in label importance hidden in the data and effectively improve the performance of MLL.
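The abstract does not say which algorithms the paper compares, so the following is only a minimal sketch of the general idea of turning logical (0/1) labels into label distributions. It assumes a simple nearest-neighbor smoothing scheme chosen purely for illustration; the function name `enhance_labels`, the parameter `k`, and the toy data are all hypothetical and are not taken from the paper.

```python
import math


def enhance_labels(features, logical_labels, k=3):
    """Convert 0/1 logical labels into label distributions.

    Illustrative assumption only (not the paper's method): each relevant
    label is weighted by how many of the k nearest instances also carry it,
    and the weights are then normalized to sum to 1.
    """
    n = len(features)
    distributions = []
    for i in range(n):
        # The k nearest instances; the secondary key (j != i) guarantees that
        # the instance itself is ranked first among points at distance 0.
        neighbors = sorted(
            range(n),
            key=lambda j: (math.dist(features[i], features[j]), j != i),
        )[:k]
        scores = []
        for label in range(len(logical_labels[i])):
            support = sum(logical_labels[j][label] for j in neighbors)
            # Irrelevant labels (logical value 0) keep a description degree of 0.
            scores.append(support * logical_labels[i][label])
        total = sum(scores)
        distributions.append([s / total if total else 0.0 for s in scores])
    return distributions


# Toy data: four instances with two features and three candidate labels.
features = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]]
logical_labels = [[1, 1, 0], [1, 0, 0], [1, 1, 0], [0, 1, 1]]

for row in enhance_labels(features, logical_labels, k=3):
    print([round(value, 3) for value in row])
```

In this toy run the first instance has two relevant labels, but the first one is shared by all three of its nearest instances and the second by only two, so the first label receives the larger description degree. This is the kind of importance difference that a uniform label distribution assumption would discard.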
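[Affiliations]: School of Computer Science and Engineering, Southeast University; Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education; Collaborative Innovation Center of Novel Software Technology and Industrialization (Nanjing University); Collaborative Innovation Center of Wireless Communications Technology (Southeast University)
[Funding]: National Natural Science Foundation of China, Excellent Young Scientists Fund (61622203); Natural Science Foundation of Jiangsu Province, Distinguished Young Scholars Fund (BK20140022)
[Classification number]: TP301.6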
Article ID: 2265495
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/2265495.html