Research on Stopping Criteria and Evaluation Measures for Active Learning

Published: 2018-09-11 19:43

[Abstract]: Active learning is one of the most active research directions in machine learning. It aims to obtain a high-performing classification model at the lowest possible human labeling cost. Whether a suitable stopping criterion can be defined is therefore of great importance to whether active learning achieves its full effect. In addition, evaluating an active learning algorithm usually requires quantitative evaluation measures, a problem that previous work has overlooked. This thesis studies these two problems. It first introduces several commonly used stopping criteria for active learning. The existing selected-accuracy stopping criterion applies only when examples are labeled in batches; to address this limitation, the thesis proposes an improved selected-accuracy stopping criterion for the setting in which a single example is labeled per round. The criterion approximates the selected accuracy by monitoring how well the predicted labels match the true labels over a fixed number of learning rounds counted back from the current round: the higher the match rate, the higher the selected accuracy. A sliding time window then tracks the selected accuracy in real time, and the active learning algorithm is stopped once it exceeds a preset threshold. Taking active learning based on support vector machines as an example, the effectiveness and feasibility of the criterion are verified on six benchmark datasets. The results show that, with a suitable threshold, the criterion finds a reasonable time to stop active learning. The method broadens the applicability of the selected-accuracy stopping criterion and improves its practicality. Many active learning algorithms exist today, but they all share a single performance evaluation measure: the learning curve.
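The stopping criterion described above can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the class name `SlidingWindowSelectedAccuracy`, the parameters `window_size` and `threshold`, and their default values are hypothetical, and the abstract does not say how the window length or threshold were chosen.

```python
from collections import deque


class SlidingWindowSelectedAccuracy:
    """Approximate selected accuracy over the most recent query rounds.

    Hypothetical sketch: each round, the learner predicts a label for the
    queried example before the annotator reveals the true label. The match
    results of the last `window_size` rounds form the sliding window.
    """

    def __init__(self, window_size=10, threshold=0.9):
        if window_size <= 0:
            raise ValueError("window_size must be positive")
        self.window_size = window_size  # assumed number of rounds to look back
        self.threshold = threshold      # assumed preset stopping threshold
        # deque with maxlen drops the oldest round automatically
        self._matches = deque(maxlen=window_size)

    def update(self, predicted_label, true_label):
        """Record whether this round's prediction matched the true label."""
        self._matches.append(predicted_label == true_label)

    def selected_accuracy(self):
        """Return the match rate in the window, or None until it is full."""
        if len(self._matches) < self.window_size:
            return None
        return sum(self._matches) / self.window_size

    def should_stop(self):
        """Stop once the windowed selected accuracy exceeds the threshold."""
        accuracy = self.selected_accuracy()
        return accuracy is not None and accuracy > self.threshold
```

In an active learning loop, `update` would be called once per round with the model's prediction for the queried example and the label returned by the annotator, followed by a check of `should_stop()`. The strict comparison follows the abstract's wording that learning stops when the selected accuracy is higher than the threshold.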
The learning curve distinguishes performance differences between classification models well over the whole active learning iteration process, so most papers use it as the standard for comparing the performance of different classification algorithms. For two active learning algorithms with similar classification performance, however, subtle differences in performance are hard to observe from the shape of the learning curves. To address this problem, the thesis mines the information hidden in the learning curve and proposes four quantitative performance evaluation measures for active learning: the area under the learning curve, the area under the logarithmic learning curve, the average gradient angle, and the logarithmic average gradient angle. When comparing active learning algorithms built on homogeneous classifiers, all four measures ensure a fair evaluation. When comparing algorithms built on heterogeneous classifiers, the average gradient angle and the logarithmic average gradient angle may be more suitable than the other two measures. In addition, the two logarithmic measures place more weight on the rate of performance improvement in the initial stage of active learning. Extensive experiments on 9 datasets and several benchmark active learning algorithms verify the practicality of the four measures.
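The abstract names the four measures but does not give their formulas, so the following is a sketch under explicit assumptions rather than the thesis's definitions. It assumes the learning curve is a list of (number of labeled examples, accuracy) points, that the area is computed with the trapezoidal rule, that the gradient angle of a segment is the arctangent of its slope, and that the "logarithmic" variants take the natural logarithm of the horizontal axis. The function and key names are hypothetical.

```python
import math


def _trapezoid_area(xs, ys):
    """Area under a piecewise-linear curve (trapezoidal rule)."""
    return sum(
        (xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2.0
        for i in range(len(xs) - 1)
    )


def _mean_gradient_angle(xs, ys):
    """Mean angle, in degrees, of the segments joining consecutive points."""
    angles = [
        math.atan2(ys[i + 1] - ys[i], xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    ]
    return math.degrees(sum(angles) / len(angles))


def learning_curve_measures(num_labeled, accuracy):
    """Compute four assumed summary measures of one learning curve.

    num_labeled: strictly increasing positive counts of labeled examples.
    accuracy:    model performance measured at each count.
    """
    if len(num_labeled) != len(accuracy) or len(num_labeled) < 2:
        raise ValueError("need two equal-length sequences with >= 2 points")
    if num_labeled[0] <= 0 or any(
        b <= a for a, b in zip(num_labeled, num_labeled[1:])
    ):
        raise ValueError("num_labeled must be positive and strictly increasing")

    xs = [float(n) for n in num_labeled]
    # Assumption: log-scaling the x axis stretches the early rounds, which
    # is one way to put more weight on the initial learning stage.
    log_xs = [math.log(n) for n in num_labeled]
    ys = [float(a) for a in accuracy]

    return {
        "alc": _trapezoid_area(xs, ys),
        "log_alc": _trapezoid_area(log_xs, ys),
        "mean_gradient_angle": _mean_gradient_angle(xs, ys),
        "log_mean_gradient_angle": _mean_gradient_angle(log_xs, ys),
    }
```

Two algorithms whose curves look nearly identical when plotted can then be compared by these scalar values. Note that under these assumptions the angle values depend on how the two axes are scaled, so curves should be measured on the same axes before they are compared.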
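[Degree-granting institution]: Jiangsu University of Science and Technology
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP181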
