Research on Robust Graph-Based Learning Methods for Relation Exploitation
Published: 2019-07-04 15:36
[Abstract]: Relations are ubiquitous in the real world. In machine learning, two kinds of relations in data cannot be ignored: 1) relations among samples, and 2) relations among labels. A large body of research shows that exploiting these two kinds of relations properly is crucial to improving the predictive ability of trained models. Graph-based methods are a mainstream paradigm for relation exploitation, and representative work in this area has received a ten-year best paper award from the international machine learning community. After more than a decade of research, graph-based methods have produced many results. However, their learning performance depends heavily on how the graph is constructed. In real tasks a suitable graph construction is usually hard to determine, so learning performance is not robust and is sometimes even harmed. This master's thesis studies the important problem of improving the robustness of relation exploitation and makes the following main contributions. First, to address the sensitivity of sample-relation exploitation to graph construction, it proposes a graph quality judgment method based on the large-margin criterion. The method formalizes the difficult problem of robust sample-relation exploitation within the classical semi-supervised support vector machine framework, and an efficient optimization algorithm is given. Experimental results show that the method significantly improves the robustness of sample-relation exploitation and effectively avoids the performance degradation that traditional methods can cause. The thesis further extends the large-margin criterion to noisy sample relations and proposes an efficient learning algorithm that effectively prevents noisy sample relations from harming performance.
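The dependence on graph construction mentioned in the abstract can be illustrated with a small sketch. This is not the thesis's large-margin method: it assumes a plain iterative label propagation on an unweighted symmetric kNN graph, and the helper names `knn_graph` and `propagate`, the toy data, and the choice of k are all hypothetical. The only point is that the same propagation rule gives different predictions on different graphs.

```python
import numpy as np

def knn_graph(X, k):
    """Symmetric 0/1 k-nearest-neighbor adjacency matrix without self-loops."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)               # a point is never its own neighbor
    nearest = np.argsort(d, axis=1)[:, :k]    # indices of the k closest points
    W = np.zeros_like(d)
    rows = np.repeat(np.arange(len(X)), k)
    W[rows, nearest.ravel()] = 1.0
    return np.maximum(W, W.T)                 # keep an edge if either end chose it

def propagate(W, y, labeled, n_iter=200):
    """Average neighbor scores repeatedly; labeled nodes stay clamped to y (+1/-1)."""
    f = np.zeros(len(y))
    f[labeled] = y[labeled]
    deg = W.sum(axis=1)
    for _ in range(n_iter):
        f = W @ f / deg
        f[labeled] = y[labeled]
    return np.sign(f)

# Toy data (assumed): two well-separated clusters on a line, three labeled points.
X = np.array([0, 1, 2, 3, 10, 11, 12, 13], dtype=float).reshape(-1, 1)
y = np.array([1, 1, 1, 1, -1, -1, -1, -1], dtype=float)
labeled = np.array([0, 1, 7])

acc_sparse = (propagate(knn_graph(X, 2), y, labeled) == y).mean()  # clusters stay separate
acc_dense = (propagate(knn_graph(X, 7), y, labeled) == y).mean()   # fully connected graph
print(acc_sparse, acc_dense)  # prints 1.0 0.625
```

With k = 2 the graph keeps the two clusters apart and every point is labeled correctly. With k = 7 the graph is fully connected, the two positive labels outvote the single negative one, and the unlabeled points of the second cluster are misclassified.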
Second, to address the sensitivity of label-relation exploitation to graph construction, the thesis proposes a label-relation exploitation method based on arranging classifiers in a circle. By organizing the classifiers in a circle, the method overcomes the strong influence that classifier order has on performance when traditional learning methods exploit label relations. The thesis shows that the time complexity of the method is comparable to that of traditional methods, so it does not significantly increase computational cost. Experimental results show that the method significantly improves the robustness of label-relation exploitation and effectively avoids the poor performance that traditional label-relation exploitation methods can exhibit.
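The idea of arranging classifiers in a circle can be sketched as follows. The abstract gives no algorithmic detail, so this is only one plausible reading and not the thesis's algorithm: it assumes one classifier per label that takes the features plus the current values of all other labels as input, trained on the true labels, with prediction walking around the circle for a few rounds. The ridge least-squares base learner and the names `fit_linear`, `predict_linear`, `fit_circle`, and `predict_circle` are hypothetical choices made to keep the example self-contained.

```python
import numpy as np

def fit_linear(A, t, reg=1e-3):
    """Ridge least-squares classifier for targets t in {0, 1} (assumed base learner)."""
    A1 = np.hstack([A, np.ones((len(A), 1))])          # append a bias column
    gram = A1.T @ A1 + reg * np.eye(A1.shape[1])
    return np.linalg.solve(gram, A1.T @ (2 * t - 1))   # regress onto -1/+1

def predict_linear(w, A):
    A1 = np.hstack([A, np.ones((len(A), 1))])
    return (A1 @ w > 0).astype(float)

def fit_circle(X, Y):
    """Train classifier j on the features plus the true values of all other labels."""
    return [fit_linear(np.hstack([X, np.delete(Y, j, axis=1)]), Y[:, j])
            for j in range(Y.shape[1])]

def predict_circle(models, X, n_rounds=3):
    """Walk around the circle several times, refreshing one label per step."""
    Y_hat = np.zeros((len(X), len(models)))            # initial label guesses
    for _ in range(n_rounds):
        for j, w in enumerate(models):
            others = np.delete(Y_hat, j, axis=1)       # latest estimates of the other labels
            Y_hat[:, j] = predict_linear(w, np.hstack([X, others]))
    return Y_hat

# Toy multi-label data (assumed): three correlated labels from two features.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
Y = np.column_stack([X[:, 0] > 0, X[:, 0] + X[:, 1] > 0, X[:, 1] > 0]).astype(float)

Y_hat = predict_circle(fit_circle(X, Y), X)
print("agreement with training labels:", (Y_hat == Y).mean())
```

In a chain, the first classifier never sees the other labels, so the chosen order matters. In this circular sketch every classifier is revisited after the others have produced estimates, at the price of a small constant number of extra passes at prediction time.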
[Degree-granting institution]: Nanjing University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP181
Article ID: 2510022
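[Similar literature]
Related master's theses (top 1)
1 王少博 (Wang Shaobo); Research on Robust Graph-Based Learning Methods for Relation Exploitation [D]; Nanjing University; 2017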
Article link: https://www.wllwen.com/kejilunwen/zidonghuakongzhilunwen/2510022.html