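Image-Text Data Fusion Classification Based on Weighted KNN
Published: 2018-01-19 23:52
Keywords: image-text data; softmax multi-class classifier; multi-class support vector machine; weighted KNN; fusion classification method. Source: Journal of Image and Graphics (《中国图象图形学报》), 2016, Issue 07. Paper type: journal article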
[Abstract]: Objective: The best classification method for image and text data differs across application scenarios, yet most existing semantic-level fusion algorithms assume that images and text are classified with the same method. When they are applied to different classification methods, the classification decision bases are not unified, so the results are poor and fusion classification performance drops sharply. To address this problem, a fusion classification method based on weighted KNN is proposed. Methods: First, a softmax multi-class classifier and a multi-class support vector machine (SVM) are used to classify images and text, respectively. At the same time, the classification decision values of the correctly classified image and text training instances, weighted by the per-class classification accuracy on the training set, are used to build an image KNN model and a text KNN model. These models are then applied to the image and text classification decision values of each test instance: the number of the k nearest instances that belong to each class determines the test instance's classification probabilities, which unifies the decision basis for images and text. Finally, the numbers of correctly classified images and texts in the training set determine the fusion coefficients for the image and text classification probabilities of the test instance, so that image-text fusion is carried out under a unified decision basis. Results: Experiments were run on image-text pairs from the Attribute Discovery dataset and compared with the baseline method. The classification accuracy of the proposed fusion algorithm is higher than that of image classification or text classification alone, and its average classification accuracy is 4.45% higher than that of the baseline method. In addition, its average ability to integrate image and text information is 4.19% higher than that of the baseline method. Conclusion: The proposed algorithm unifies the classification decision bases of different image and text classification methods, fuses image and text data effectively, and shows strong information integration ability and good fusion classification performance.
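The steps described in the abstract can be sketched roughly as follows. This is a minimal pure-Python illustration, not the authors' implementation. The abstract gives no formulas, so several details are assumptions: the weighting multiplies each class's decision value by that class's training accuracy, test decision values are weighted the same way, neighbors are found by Euclidean distance, and the fusion coefficients are the normalized counts of correctly classified training instances. All function names and the toy decision values are hypothetical.

```python
import math
from collections import Counter


def argmax(values):
    """Index of the largest decision value, i.e. the predicted class."""
    return max(range(len(values)), key=lambda i: values[i])


def per_class_accuracy(labels, preds, n_classes):
    """Fraction of training instances of each class that were predicted correctly."""
    total = [0] * n_classes
    correct = [0] * n_classes
    for y, p in zip(labels, preds):
        total[y] += 1
        if y == p:
            correct[y] += 1
    return [c / t if t else 0.0 for c, t in zip(correct, total)]


def build_knn_model(decision_values, labels, n_classes):
    """Keep only correctly classified training instances and store their
    decision values weighted by per-class accuracy (assumed: elementwise product)."""
    preds = [argmax(d) for d in decision_values]
    acc = per_class_accuracy(labels, preds, n_classes)
    points = [
        ([a * v for a, v in zip(acc, d)], y)
        for d, y, p in zip(decision_values, labels, preds)
        if y == p
    ]
    return {"acc": acc, "points": points,
            "n_correct": len(points), "n_classes": n_classes}


def knn_probabilities(model, decision_value, k):
    """Class probabilities = share of the k nearest stored instances per class."""
    query = [a * v for a, v in zip(model["acc"], decision_value)]
    nearest = sorted(model["points"],
                     key=lambda pt: math.dist(pt[0], query))[:k]
    counts = Counter(y for _, y in nearest)
    return [counts[c] / len(nearest) for c in range(model["n_classes"])]


def fuse(img_model, txt_model, img_dv, txt_dv, k=3):
    """Combine image and text probabilities with coefficients taken from the
    numbers of correctly classified training instances (assumed: normalized counts)."""
    p_img = knn_probabilities(img_model, img_dv, k)
    p_txt = knn_probabilities(txt_model, txt_dv, k)
    total = img_model["n_correct"] + txt_model["n_correct"]
    w_img = img_model["n_correct"] / total
    w_txt = txt_model["n_correct"] / total
    return [w_img * a + w_txt * b for a, b in zip(p_img, p_txt)]


# Toy data: softmax outputs for images, SVM decision values for text (made up).
labels = [0, 0, 1, 1, 1]
img_train = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.2, 0.8], [0.6, 0.4]]
txt_train = [[1.5, -1.2], [0.4, -0.3], [-0.8, 1.1], [-2.0, 1.7], [-0.5, 0.9]]

img_model = build_knn_model(img_train, labels, n_classes=2)
txt_model = build_knn_model(txt_train, labels, n_classes=2)

fused = fuse(img_model, txt_model, img_dv=[0.7, 0.3], txt_dv=[1.0, -0.9], k=3)
print(fused, "-> predicted class", argmax(fused))
```

Because both modalities are mapped to neighbor-vote probabilities before they are combined, the softmax outputs and the SVM margins never need to be compared on their original, incompatible scales, which is the unification of decision bases that the abstract describes.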
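[Affiliations]: Institute of Electronics, Chinese Academy of Sciences; University of Chinese Academy of Sciences
[Funding]: National Natural Science Foundation of China (41301493); Academic Exchange Fund for the High-Resolution Earth Observation Field (GFEX04060103)
[CLC number]: TP391.41; TP18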
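[Body snapshot]: 0 Introduction. With the development of Internet technology, the volume of data has grown explosively, and data types are no longer limited to text alone but extend to multimedia data such as images, audio, and video. Images, with their rich visual features, present abstract data to people in an intuitive, vivid, and concrete way, which makes the dissemination and exchange of information more convenient. Internet multimedia data are large in scale, varied in type, organizational structure,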
Article ID: 1446001
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/1446001.html