Research on Supervised Hashing Methods for Image Retrieval and Classification

Published: 2018-04-20 01:05

Topic: hash learning + image retrieval; Reference: Shandong University, 2017 master's thesis

[Abstract]: In recent years, with rapid social and economic development, Internet technology has come to play an increasingly important role in daily work and life, and with the spread of electronic devices and social media, multimedia data (text, images, video, and so on) has grown rapidly. Images are especially popular because of their rich content and expressive power, and their number is growing exponentially, which poses challenges for image retrieval and storage. Nearest neighbor search is a classic method in information retrieval, but exact nearest neighbor search over large-scale data is very difficult. To address this, approximate nearest neighbor search has attracted wide attention from researchers: it is comparatively efficient, has relatively low complexity, gives reasonably accurate results, and has become an active research topic in recent years. For similarity retrieval, the traditional approach measures similarity on the original features of the data. This works well when the dataset is small, but as the data volume and feature dimensionality grow, feature matching becomes impractical because of its very high computational cost, and it also places heavy demands on storage.
Hashing methods emerged in response. Owing to their superior computational and storage performance, hashing-based methods have developed rapidly in recent years and attracted growing attention from scholars and researchers. Hashing-based similarity retrieval maps the feature information of data in the original space into a binary Hamming space while preserving, as far as possible, the local structure and semantic information of the original data. By computing pairwise Hamming distances between hash codes, approximate nearest neighbor retrieval can return results quickly. Hashing has linear retrieval complexity, and by converting data into compact binary hash codes it greatly reduces storage cost and uses storage space more effectively. Hashing is therefore well suited to large-scale data retrieval tasks.
Hashing methods fall into two classes according to whether labels are used during learning: unsupervised and supervised. Supervised hashing aims to make full use of both the features and the labels of the training data when learning hash codes, so that the learned codes preserve the semantics of the original data; it is therefore more accurate than unsupervised hashing and better suited to practical applications. Many supervised hashing methods have been proposed, and some perform well. However, most of them are designed for retrieval and cannot be used for classification. That is, the hash codes cannot be used to predict the class of the data even though the codes themselves carry rich semantic information, which is a substantial loss of information. If hash codes could be used directly for classification, hashing would be more valuable in real projects. To address this problem, we propose a supervised hash learning method capable of label prediction, called Class Graph Preserving Hashing. The method fuses semantic label information with the hash codes, so the learned codes carry rich semantic information, and it uses the learned mapping matrix and hash codes to predict the labels of query data directly. It first learns the hash function by simultaneously enforcing label consistency and preserving class-graph similarity, then learns the hash codes by minimizing the quantization error between the hash codes and the hash function; an iterative optimization method is also proposed. The method was evaluated on three image datasets and compared with several hashing methods that currently perform well. The experimental results show that Class Graph Preserving Hashing performs well on both image retrieval and classification.
In practice, however, only a very small fraction of images come with labels, and most have no label information. How to use a small number of labels for retrieval over large-scale image collections is the problem that semi-supervised hashing addresses. To make the objective function easier to optimize, many current semi-supervised hashing methods first relax the problem and then threshold the continuous solution, which loses some information. In addition, to make better use of the features of the images themselves, many methods use a similarity matrix to preserve similarity; such a matrix is generally n×n, which is expensive to compute and store and may not be feasible at all on large-scale datasets. We therefore propose a semi-supervised graph cut hashing algorithm, which uses graph cut optimization to optimize the hash codes directly, reducing the information loss caused by relaxation. We also reduce the dimensionality of the similarity matrix through sparse embedding, which speeds up computation. We ran experiments on two datasets, and the results show that, with partial labels, the proposed semi-supervised graph cut hashing performs well compared with several other hashing methods.
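
The retrieval step described in the abstract, ranking database items by Hamming distance to a query code, can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the helper names `to_code`, `hamming`, and `search`, the 8-bit code length, and the toy codes are all assumptions made for illustration, and the codes are written by hand rather than learned.

```python
from typing import List, Sequence, Tuple


def to_code(bits: Sequence[int]) -> int:
    """Pack 0/1 bits into one Python integer (first bit is most significant)."""
    code = 0
    for bit in bits:
        code = (code << 1) | (1 if bit else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Number of bit positions where two codes differ: bit count of the XOR."""
    return bin(a ^ b).count("1")


def search(query: int, database: Sequence[int], k: int) -> List[Tuple[int, int]]:
    """Return (index, distance) for the k database codes closest to the query.

    This is a plain linear scan. Ties keep database order because
    sorted() is stable.
    """
    scored = [(i, hamming(query, code)) for i, code in enumerate(database)]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Hypothetical 8-bit codes standing in for learned hash codes.
database = [
    to_code([0, 0, 0, 0, 0, 0, 0, 0]),
    to_code([1, 1, 1, 1, 1, 1, 1, 1]),
    to_code([1, 0, 0, 0, 0, 0, 0, 0]),
]
query = to_code([1, 0, 0, 0, 0, 0, 0, 1])

print(search(query, database, k=2))  # prints [(2, 1), (0, 2)]
```

The scan visits every database code once, which matches the linear retrieval complexity mentioned above, and each comparison is just an XOR followed by a bit count. Storage is small for the same reason: a 64-bit code occupies 8 bytes, whereas, for example, a 512-dimensional 32-bit floating-point feature vector occupies 2,048 bytes.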

[Degree-granting institution]: Shandong University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.41
