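An Associative Classification Method for Imbalanced Data Based on Sampling and Rules
Published: 2018-05-01 02:19
Topics: associative classification + imbalanced data; Source: Systems Engineering - Theory & Practice (《系统工程理论与实践》), 2017, No. 04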
[Abstract]: Imbalanced data poses a major challenge to traditional associative classification algorithms. To improve the accuracy of associative classification on imbalanced datasets, this paper works at both the data level and the rule level, proposing key value sampling (KVS) and rule validation (RV). KVS balances the class distribution by adding data strongly correlated with the minority class and removing data weakly correlated with the majority class; this avoids discarding large amounts of useful information and reinforces the information carried by data strongly correlated with the minority class. RV validates the rules in the initially generated classifier and adjusts rules with poor classification performance, thereby safeguarding the quality of the rules in the classifier. Experiments show that the proposed methods effectively improve the accuracy of associative classification on imbalanced data.
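The abstract describes KVS and RV only at a high level, so the following is a rough sketch of one possible reading and not the authors' algorithm. It assumes that "correlation" between a record and a class can be approximated by the mean confidence P(class | item) over the record's items, that both classes are resampled toward the midpoint of their sizes, and that a poorly performing rule is "adjusted" by relabeling it with the most common class among the records it covers. The function names, the `min_accuracy` threshold, and the toy data are all hypothetical.

```python
from collections import Counter


def item_class_confidence(records):
    """Estimate P(label | item) for every (item, label) pair in the data."""
    item_counts = Counter()
    pair_counts = Counter()
    for items, label in records:
        for item in items:
            item_counts[item] += 1
            pair_counts[(item, label)] += 1
    return {pair: n / item_counts[pair[0]] for pair, n in pair_counts.items()}


def record_score(items, label, conf):
    """Mean confidence of a record's items toward the given label (assumed proxy)."""
    if not items:
        return 0.0
    return sum(conf.get((item, label), 0.0) for item in items) / len(items)


def key_value_resample(records, minority, majority):
    """Duplicate strongly minority-correlated records and drop weakly
    majority-correlated ones until both classes reach the midpoint size."""
    conf = item_class_confidence(records)
    minority_rows = [r for r in records if r[1] == minority]
    majority_rows = [r for r in records if r[1] == majority]
    if not minority_rows:
        raise ValueError("no minority-class records to sample from")
    target = (len(minority_rows) + len(majority_rows)) // 2

    # Strongest records first; Python's sort is stable, so ties keep input order.
    minority_rows.sort(key=lambda r: record_score(r[0], minority, conf), reverse=True)
    majority_rows.sort(key=lambda r: record_score(r[0], majority, conf), reverse=True)

    # Oversample: cycle through the minority records, strongest first.
    extra = [minority_rows[i % len(minority_rows)]
             for i in range(target - len(minority_rows))]
    # Undersample: keep only the majority records with the highest scores.
    return minority_rows + extra + majority_rows[:target]


def validate_rules(rules, validation, min_accuracy=0.5):
    """Check each (antecedent, label) rule on validation data; relabel rules
    whose accuracy on the records they cover falls below min_accuracy."""
    adjusted = []
    for antecedent, label in rules:
        covered = [lab for items, lab in validation if antecedent <= items]
        if not covered:
            adjusted.append((antecedent, label))  # no evidence either way: keep
            continue
        accuracy = covered.count(label) / len(covered)
        if accuracy < min_accuracy:
            label = Counter(covered).most_common(1)[0][0]
        adjusted.append((antecedent, label))
    return adjusted


# Toy data: 2 minority ("pos") records and 6 majority ("neg") records.
records = [
    ({"a", "x"}, "pos"), ({"a", "y"}, "pos"),
    ({"b", "x"}, "neg"), ({"b", "y"}, "neg"), ({"b", "x"}, "neg"),
    ({"b", "y"}, "neg"), ({"a", "x"}, "neg"), ({"b", "z"}, "neg"),
]
balanced = key_value_resample(records, minority="pos", majority="neg")
checked = validate_rules([({"b"}, "pos"), ({"a"}, "pos")], records)
```

On the toy data, `balanced` holds four records of each class, and the rule `{"b"} -> "pos"` is relabeled because every record it covers is negative. A real implementation would need the paper's own definitions of correlation strength and rule adjustment, which the abstract does not give.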
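[Author affiliation]: Institute of Systems Engineering, Dalian University of Technology
[Funding]: National Natural Science Foundation of China (71671024, 71421001); Humanities and Social Sciences Fund of the Ministry of Education (15YJCZH198); Liaoning Economic and Social Development Project (20161s lktzizzx-01)
[CLC number]: TP311.13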
Article ID: 1827299
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/1827299.html