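Research on an Adaptive Weighted Extreme Learning Machine for Imbalanced Classification Problems
Published: 2018-01-07 06:31
Source: Shenzhen University, master's thesis, 2017. Document type: degree thesis
Keywords: imbalanced data; classification; imbalanced classification; extreme learning machine; weighted extreme learning machine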
[Abstract]: The extreme learning machine (ELM), proposed in 2006 by Huang et al. of Nanyang Technological University, Singapore, is a learning algorithm for single-hidden-layer feedforward neural networks (SLFNs). During training it does not adjust the input weights of the network or the biases of the hidden neurons; only the number of hidden-layer nodes has to be set. The output weights are obtained as the unique optimal solution of a least-squares problem, which greatly speeds up SLFN training and, to some extent, lowers the risk of overfitting. ELM is nevertheless still affected by imbalanced data distributions. In 2013, Zong et al. added sample weighting to ELM and proposed the weighted extreme learning machine (WELM), which applies ELM effectively to imbalanced datasets. The weighting scheme of WELM is fixed, however. For a binary problem in which the majority class A contains sumA samples and the minority class B contains sumB samples, WELM assigns the weight 1/sumA to every class-A sample and the weight 1/sumB to every class-B sample, a choice that is clearly not optimal.

This thesis makes three contributions. First, it examines how the hidden-layer output weights affect the way ELM handles imbalanced classification. To see directly how an imbalanced dataset affects ELM performance, the imbalance ratio is increased step by step on several datasets. The experiments show that ELM achieves its best performance when the dataset is balanced, and that the degree of imbalance has a direct effect on its classification results.

Second, it proposes a new adaptive weighting strategy for the hidden-layer output in order to improve the predictive performance of WELM. WELM effectively improves the classification performance of ELM on imbalanced datasets, but its weighting scheme is too arbitrary. Starting from the goal of reducing the influence of misclassified samples on the classifier, the thesis proposes the adaptive weighted extreme learning machine (SawELM) and designs a new mechanism for computing the output-layer weights. The mechanism has two modules: (1) gradually reducing the weights of misclassified training samples, and (2) dynamically updating the output-layer values of misclassified samples. The first module reduces the influence of misclassified samples when the output-layer weights are computed; the second tells SawELM how to adjust the output-layer weights. For samples that WELM misclassifies, SawELM therefore weakens their influence on the output-layer weights on the one hand and increases their output values on the other, so that the classifier can learn them better.

Third, it reports extensive comparative experiments that confirm the feasibility and effectiveness of SawELM. Fifty binary imbalanced datasets were randomly selected from the KEEL data repository, and SawELM, ELM and WELM were compared on three metrics: accuracy, G-mean and F1-measure. The results show that the new adaptive mechanism is effective and that SawELM significantly improves the imbalanced-classification performance of WELM. Compared with ELM and WELM, SawELM achieves significantly higher G-mean and F1-measure, while its accuracy is higher than that of WELM and on par with that of ELM.
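The two baselines described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the thesis's implementation: the sigmoid activation, the regularisation constant `C`, the label coding in {-1, +1}, and all function names (`hidden_output`, `train_welm`, `predict`, `g_mean`) are assumptions made for the example. The SawELM update rule is not given in the abstract, so it is deliberately not reproduced here.

```python
import numpy as np

def hidden_output(X, W_in, b):
    """Sigmoid hidden-layer output H for fixed random input weights and biases."""
    return 1.0 / (1.0 + np.exp(-(X @ W_in + b)))

def train_welm(X, y, n_hidden=20, C=1.0, weighted=True, seed=0):
    """Train an ELM (weighted=False) or a WELM (weighted=True); y is in {-1, +1}."""
    rng = np.random.default_rng(seed)
    W_in = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never tuned
    b = rng.normal(size=n_hidden)                   # random hidden biases, never tuned
    H = hidden_output(X, W_in, b)
    if weighted:
        # Fixed scheme from the abstract: each sample gets 1 / (size of its class).
        counts = {c: np.sum(y == c) for c in np.unique(y)}
        w = np.array([1.0 / counts[c] for c in y])
    else:
        w = np.ones(len(y))
    # Regularised weighted least squares (C is an assumption of this sketch):
    # (I / C + H^T W H) beta = H^T W y
    HtW = H.T * w  # scales column j of H^T by the weight of sample j
    beta = np.linalg.solve(np.eye(n_hidden) / C + HtW @ H, HtW @ y)
    return W_in, b, beta

def predict(model, X):
    """Classify by the sign of the single output node."""
    W_in, b, beta = model
    return np.where(hidden_output(X, W_in, b) @ beta >= 0, 1, -1)

def g_mean(y_true, y_pred):
    """Geometric mean of the recall on the positive and negative classes."""
    tpr = np.mean(y_pred[y_true == 1] == 1)
    tnr = np.mean(y_pred[y_true == -1] == -1)
    return float(np.sqrt(tpr * tnr))

if __name__ == "__main__":
    # Synthetic imbalanced data: 200 majority (-1) and 20 minority (+1) samples.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0.0, 1.0, size=(200, 2)),
                   rng.normal(1.5, 1.0, size=(20, 2))])
    y = np.concatenate([-np.ones(200, dtype=int), np.ones(20, dtype=int)])
    for name, flag in [("ELM", False), ("WELM", True)]:
        model = train_welm(X, y, weighted=flag)
        print(name, "G-mean on training data:", g_mean(y, predict(model, X)))
```

Setting `weighted=False` gives every sample the same weight and recovers a plain regularised ELM, so the two baselines differ only in the diagonal weight vector `w`. An adaptive scheme such as the one the abstract describes would change `w` (and the targets of misclassified samples) between solves rather than fixing them once.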
[Degree-granting institution]: Shenzhen University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP181
Article ID: 1391394
Article link: https://www.wllwen.com/kejilunwen/zidonghuakongzhilunwen/1391394.html