
Label Distribution Learning Using the k-means Algorithm

Published: 2018-07-28 16:36
[Abstract]: Label distribution learning is a machine learning paradigm proposed in recent years that handles certain label-ambiguity problems well. Existing label distribution learning algorithms all build parametric models from conditional probabilities, but they do not fully exploit the relationship between features and labels. This paper starts from the observation that samples with similar features should also have similar label distributions. It clusters the training samples with the prototype-based k-means algorithm and proposes label distribution learning based on the k-means algorithm (LDLKM). First, k-means is used to obtain the mean feature vector of each cluster, and the mean vector of the corresponding label distributions is then computed for each cluster. Finally, the distances between the test samples and the mean vectors of the training set are used as weights when predicting the label distributions of the test set. Experiments on 6 public datasets compare LDLKM with 3 existing label distribution learning algorithms on 5 evaluation metrics, and the results show that the proposed LDLKM algorithm is effective.
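The three steps described in the abstract can be sketched in a few lines of pure Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how distances are turned into weights, so the sketch assumes normalized inverse Euclidean distance, and the function names (`kmeans`, `fit`, `predict`), the toy data, and the empty-cluster handling are all hypothetical.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(rows):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def kmeans(X, k, iters=100, seed=0):
    """Plain Lloyd's k-means; returns (centers, cluster index per sample)."""
    rng = random.Random(seed)
    centers = [list(x) for x in rng.sample(X, k)]
    assign = None
    for _ in range(iters):
        new_assign = [min(range(k), key=lambda j: sq_dist(x, centers[j]))
                      for x in X]
        if new_assign == assign:  # assignments stable: converged
            break
        assign = new_assign
        for j in range(k):
            members = [x for x, a in zip(X, assign) if a == j]
            if members:  # assumption: an empty cluster keeps its old center
                centers[j] = mean_vector(members)
    return centers, assign


def fit(X, D, k, seed=0):
    """Pair each cluster's mean feature vector with its mean label distribution."""
    centers, assign = kmeans(X, k, seed=seed)
    protos = []
    for j in range(k):
        members = [d for d, a in zip(D, assign) if a == j]
        if members:
            protos.append((centers[j], mean_vector(members)))
    return protos


def predict(protos, x, eps=1e-12):
    """Blend cluster label distributions, weighted by inverse distance to x.

    Assumption: weight = 1 / (Euclidean distance + eps), normalized to sum
    to 1. The source only says that distance is used as the weight.
    """
    weights = [1.0 / (math.sqrt(sq_dist(x, c)) + eps) for c, _ in protos]
    total = sum(weights)
    n_labels = len(protos[0][1])
    return [sum(w * d[i] for w, (_, d) in zip(weights, protos)) / total
            for i in range(n_labels)]


# Toy data (hypothetical): two well-separated groups, two labels.
X_train = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
D_train = [[0.9, 0.1], [0.7, 0.3], [0.2, 0.8], [0.4, 0.6]]
protos = fit(X_train, D_train, k=2)
print(predict(protos, [10.0, 10.0]))
```

Because the weights are normalized and every cluster-level label distribution sums to 1, the prediction is itself a valid distribution. A test sample near one cluster center is dominated by that cluster's mean label distribution, which matches the paper's premise that similar features imply similar label distributions.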
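[Affiliation]: Key Laboratory of Granular Computing, Minnan Normal University
[Funding]: National Natural Science Foundation of China (61379049, 61379089)
[Classification number]: TP181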


Article ID: 2150903


Article link: https://www.wllwen.com/kejilunwen/zidonghuakongzhilunwen/2150903.html

