Online Sequential Active Learning Method

Published: 2018-01-05 17:22

Keywords: online sequential active learning method　Source: 《计算机科学》 (Computer Science), 2017, Issue 01　Paper type: journal article

[Abstract]: The real world contains large amounts of unlabeled data, such as medical images and web pages, and the problem is even more pronounced in the big data era. Labeling such data is very costly. Active learning is an effective way to address this problem and has been an active research topic in machine learning and data mining in recent years. This paper proposes an active learning algorithm based on the online sequential extreme learning machine (OS-ELM). By exploiting the incremental learning ability of OS-ELM, the algorithm can significantly improve the efficiency of the learning system. In addition, the algorithm uses sample entropy as a heuristic to measure the importance of unlabeled samples, and uses a K-nearest neighbor classifier as the Oracle that labels the selected unlabeled samples. Experimental results show that the proposed algorithm learns quickly and labels accurately.
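The two ingredients named in the abstract, incremental OS-ELM updates and entropy-based sample scoring, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the sigmoid hidden layer, the ridge term `reg`, the softmax used to turn network outputs into class probabilities, and all class and function names (`OSELM`, `entropy_scores`, `select_most_uncertain`) are assumptions made for the example.

```python
import numpy as np


class OSELM:
    """Minimal online sequential ELM sketch (hypothetical names, assumed details)."""

    def __init__(self, n_inputs, n_hidden, n_classes, reg=1e-2, seed=0):
        rng = np.random.default_rng(seed)
        # Random input weights and biases stay fixed, as in a standard ELM.
        self.W = rng.normal(size=(n_inputs, n_hidden))
        self.b = rng.normal(size=n_hidden)
        self.reg = reg              # assumed ridge term for numerical stability
        self.n_classes = n_classes
        self.P = None               # inverse of the regularized Gram matrix
        self.beta = None            # output weights

    def _hidden(self, X):
        # Sigmoid hidden-layer activations (assumed activation function).
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit_initial(self, X, y):
        # Initial batch: closed-form regularized least squares.
        H = self._hidden(X)
        T = np.eye(self.n_classes)[y]          # one-hot targets
        self.P = np.linalg.inv(H.T @ H + self.reg * np.eye(H.shape[1]))
        self.beta = self.P @ H.T @ T

    def partial_fit(self, X, y):
        # Recursive least-squares update: no retraining on earlier data.
        H = self._hidden(X)
        T = np.eye(self.n_classes)[y]
        K = np.linalg.inv(np.eye(H.shape[0]) + H @ self.P @ H.T)
        self.P = self.P - self.P @ H.T @ K @ H @ self.P
        self.beta = self.beta + self.P @ H.T @ (T - H @ self.beta)

    def predict_proba(self, X):
        # Softmax over raw outputs (an assumption, used only to get probabilities).
        scores = self._hidden(X) @ self.beta
        scores = scores - scores.max(axis=1, keepdims=True)
        e = np.exp(scores)
        return e / e.sum(axis=1, keepdims=True)


def entropy_scores(proba, eps=1e-12):
    # Shannon entropy per sample; higher means the model is less certain.
    return -np.sum(proba * np.log(proba + eps), axis=1)


def select_most_uncertain(model, X_pool, k):
    # Indices of the k pool samples with the highest predictive entropy.
    scores = entropy_scores(model.predict_proba(X_pool))
    return np.argsort(scores)[::-1][:k]
```

With the ridge term included in the initial fit, the recursive update in `partial_fit` reproduces the batch regularized least-squares solution over all data seen so far, which is why newly labeled samples can be absorbed without retraining from scratch.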
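[Author affiliations]: Key Laboratory of Machine Learning and Computational Intelligence of Hebei Province, College of Mathematics and Information Science, Hebei University; College of Computer Science and Technology, Hebei University; Hebei Branch, China Meteorological Administration Training Centre
[Funding]: Supported by the National Natural Science Foundation of China (71371063), the Natural Science Foundation of Hebei Province (F2013201220), the Key Science and Technology Research Project of Higher Education Institutions of Hebei Province (ZD20131028), and the Science and Technology Research Project of Higher Education Institutions of Hebei Province (QN20131153)
[CLC number]: TP181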
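[Body excerpt]: 1 Introduction. Active learning is a form of supervised learning. Unlike traditional passive learning, an active learner does not passively accept and process all the data supplied by humans. Instead, it actively selects the data it considers most valuable and hands them to a domain expert for labeling. The goal of active learning is to select as few samples as possible while maintaining acceptable accuracy, thereby reducing the cost of labeling and learning.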
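The select-then-label cycle described in this excerpt can be sketched as a generic pool-based loop. The sketch below is illustrative only: the function names, the fixed labeling `budget` used as the stopping rule (the excerpt speaks of acceptable accuracy instead), and the majority-vote K-nearest neighbor labeler standing in for the domain expert are all assumptions, not details taken from the paper.

```python
import numpy as np


def knn_label(x, X_ref, y_ref, k=3):
    """Label x by majority vote among its k nearest reference points (Euclidean)."""
    dists = np.linalg.norm(X_ref - x, axis=1)
    nearest = np.argsort(dists)[:k]
    return int(np.bincount(y_ref[nearest]).argmax())


def active_learning_loop(X_pool, score_fn, oracle_fn, update_fn, batch_size, budget):
    """Query the highest-scoring pool samples until the labeling budget is spent.

    score_fn(X)      -> importance score per row (higher = more valuable)
    oracle_fn(x)     -> label for a single sample
    update_fn(X, y)  -> incrementally updates the learner with new labeled data
    """
    remaining = np.arange(len(X_pool))   # indices still unlabeled
    queried = []
    while len(queried) < budget and remaining.size > 0:
        scores = score_fn(X_pool[remaining])
        n_pick = min(batch_size, budget - len(queried), remaining.size)
        order = np.argsort(scores)[::-1][:n_pick]   # positions within `remaining`
        picked = remaining[order]
        labels = np.array([oracle_fn(X_pool[i]) for i in picked])
        update_fn(X_pool[picked], labels)
        queried.extend(picked.tolist())
        remaining = np.delete(remaining, order)
    return queried
```

Passing the scoring, labeling, and update steps as callables keeps the loop independent of any particular learner, so an incremental model only needs to expose a scoring function and an update function.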

Article ID: 1384098


Article link: https://www.wllwen.com/kejilunwen/zidonghuakongzhilunwen/1384098.html
