Optimization and Algorithm Research on Semi-Supervised Hashing Image Search Based on Group Lasso

Published: 2018-03-02 19:10

  Topic: semi-supervised learning; Entry point: Group; Source: East China Normal University, 2017 master's thesis; Document type: degree thesis


[Abstract]: With the arrival of the big data era, the rapid development of Internet technology, and the growing availability of imaging devices, collecting images and other media data has become increasingly convenient, and images play a vital role in medicine, astronomy, criminal investigation, transportation, the military, environmental science, social networks, and many other fields. As the number of images on the web grows geometrically, traditional image analysis and processing faces several challenges: a huge volume of data, high feature dimensionality, large storage requirements, and slow queries. Research on large-scale image search algorithms is therefore increasingly urgent. Three considerations motivate this work. First, in practice, and especially in image search and recognition, the most abundant and most easily obtained data are unlabeled, so image search algorithms based on semi-supervised learning have real practical significance and demand. Second, image data carry color, shape, and texture features, and different dimensions of the data may be structurally or semantically related, so modeling this structure effectively is critical. Third, an efficient solver is essential for image search on large-scale datasets. Against this background, and building on existing semi-supervised hashing image search algorithms, this thesis makes the following contributions:
1. It proposes a semi-supervised hashing image search algorithm based on Group Lasso. Semi-supervised learning makes full use of all training data, labeled and unlabeled; the hashing search algorithm stores only the binary hash codes of images, which saves storage space and requires only constant query time; and the approach can be extended to search over very large image datasets.
2. It uses Group Lasso to incorporate group structure into the image search model, so that features in the same group are selected into or removed from the model together. Group Lasso introduces sparsity between groups, which acts as feature selection, avoids overfitting, and improves the accuracy of the model.
3. It introduces the proximal gradient method to optimize and quickly solve the Group Lasso-based semi-supervised hashing image search model.
4. It tests the model on the standard image datasets MNIST and CIFAR10 and compares it with other existing image search algorithms.
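To illustrate how a Group Lasso penalty and a proximal gradient solver fit together, here is a minimal Python sketch. It is not the thesis's model: the abstract does not state the actual objective, so the squared loss, the function names `group_soft_threshold` and `proximal_gradient`, the fixed step size, and the toy data are all assumptions made for illustration.

```python
import math


def group_soft_threshold(w, groups, tau):
    """Proximal operator of tau * sum_g ||w_g||_2 (block soft-thresholding).

    `groups` is a list of index lists that partition the coordinates of `w`.
    """
    out = list(w)
    for idx in groups:
        norm = math.sqrt(sum(w[i] ** 2 for i in idx))
        # A whole group is set to zero when its norm is at most tau;
        # otherwise every coordinate in the group is shrunk by the same factor.
        scale = 0.0 if norm <= tau else 1.0 - tau / norm
        for i in idx:
            out[i] = scale * w[i]
    return out


def proximal_gradient(X, y, groups, lam, step, iters=200):
    """Minimise 0.5 * ||Xw - y||^2 + lam * sum_g ||w_g||_2.

    The squared loss is an illustrative stand-in (assumption), not the
    thesis's hashing objective. `step` is assumed to be small enough for
    convergence (at most 1 / largest eigenvalue of X^T X).
    """
    n_features = len(X[0])
    w = [0.0] * n_features
    for _ in range(iters):
        # Gradient of the smooth part: X^T (Xw - y).
        residual = [sum(x * wi for x, wi in zip(row, w)) - t
                    for row, t in zip(X, y)]
        grad = [sum(row[j] * r for row, r in zip(X, residual))
                for j in range(n_features)]
        # Gradient step on the smooth loss, then the Group Lasso proximal step.
        shifted = [wi - step * g for wi, g in zip(w, grad)]
        w = group_soft_threshold(shifted, groups, step * lam)
    return w


# Toy usage (hypothetical data): two groups of two features each.
X = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]
y = [3.0, 4.0, 0.1, 0.1]
groups = [[0, 1], [2, 3]]
w_hat = proximal_gradient(X, y, groups, lam=1.0, step=1.0)
```

In this toy run the weak second group is driven exactly to zero while the first group is only shrunk, to roughly [2.4, 3.2]. This is the "selected or removed together" behaviour that the abstract attributes to Group Lasso.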
[Degree-granting institution]: East China Normal University
[Degree level]: Master's
[Year conferred]: 2017
[Classification number]: TP391.41



