Research on Annotation and Ranking Methods in Image Retrieval

Published: 2018-03-01 05:19

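Keywords: image retrieval; image classification; multiple-instance learning; image ranking; random walk; ensemble learning　Source: Shandong University, master's thesis, 2012　Document type: degree thesis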

[Abstract]: Owing to the great success of text retrieval, mainstream image search engines such as Google and Baidu still retrieve images by text keywords: they judge an image's relevance to a query from the text surrounding the image. Because the text description may not match the image content, the retrieval results are often unsatisfactory. There are generally two ways to improve image search results. The first is to annotate images semantically before retrieval, which improves the accuracy of the image keywords and lets retrieval use image semantics. The second is to re-rank the retrieval results, using features of the images and of the text itself to place relevant images first and so improve user satisfaction. In practice, image content is often ambiguous, which lowers the accuracy of traditional image annotation algorithms. Multiple-instance learning is a learning framework that has emerged in recent years; because it represents ambiguous objects well, it has been applied successfully to image classification and annotation. How the multiple-instance bags are generated is an important factor in the effectiveness of multiple-instance learning. This thesis focuses on multiple-instance image classification and annotation and proposes a new way of generating multiple-instance bags for images. To improve the generalization ability of classification, it also studies ensemble learning. In addition, an image is itself a multimodal object: the image content and the text associated with the image are different modalities of the same image. Current image ranking research does not make full use of the interactions among an image's multimodal features. By analyzing existing image ranking algorithms and considering the relationships among the modalities comprehensively, this thesis proposes a method that unifies the modalities for ranking.
This thesis makes three contributions. 1) It surveys existing image annotation algorithms based on multiple-instance learning and proposes a new annotation method in which an image is modeled as a Gaussian mixture model and each Gaussian component serves as one instance in the bag. Each instance is therefore a probabilistic representation rather than a traditional vector, and can express more information about the image. 2) It systematically analyzes existing research on ensemble learning and proposes a new selective ensemble method based on multisets. The method is an optimization problem over a discrete space, so its speed can be guaranteed; in addition, each classifier in the algorithm has its own confidence. 3) It analyzes the shortcomings of existing image ranking algorithms and proposes a new ranking algorithm that makes fuller use of an image's multimodal features. The algorithm abstracts the set of image search results as a multigraph: each vertex is a multimodal image, and the multiple edges between two images represent the similarities between their modalities. A random walk model is then used to rank the images. To verify the effectiveness of the proposed algorithms, the three algorithms were evaluated on the Corel dataset, the LCI dataset, and the Web Queries dataset. The experimental results show that the proposed algorithms effectively improve classification accuracy and ranking quality.
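The ranking step described in the abstract (a multigraph over the search results, with a random walk on top) can be illustrated with a minimal Python sketch. This is not the thesis's algorithm: the abstract does not say how the parallel edges are combined or which random-walk variant is used. The sketch assumes that the per-modality similarities are merged by a weighted sum and that a PageRank-style walk with a damping factor is run on the merged graph. The function name `random_walk_rank`, the `weights` and `damping` parameters, and the toy similarity values are all hypothetical.

```python
def random_walk_rank(similarities, weights, damping=0.85, tol=1e-10, max_iter=1000):
    """Score images by a random walk over a multigraph (illustrative sketch).

    similarities: list of n x n matrices, one per modality; entry [i][j] is the
                  weight of the edge between images i and j in that modality.
    weights:      one non-negative weight per modality (an assumption of this sketch).
    """
    n = len(similarities[0])

    # Assumption: collapse the parallel edges into one weighted edge per image pair.
    combined = [[0.0] * n for _ in range(n)]
    for w, sim in zip(weights, similarities):
        for i in range(n):
            for j in range(n):
                if i != j:  # ignore self-loops
                    combined[i][j] += w * sim[i][j]

    # Row-normalize so each row is a probability distribution over next steps.
    transition = []
    for row in combined:
        total = sum(row)
        if total > 0:
            transition.append([v / total for v in row])
        else:
            transition.append([1.0 / n] * n)  # isolated vertex: jump uniformly

    # Power iteration with uniform restart (PageRank-style walk).
    scores = [1.0 / n] * n
    for _ in range(max_iter):
        new_scores = [
            (1.0 - damping) / n
            + damping * sum(scores[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(new_scores, scores))
        scores = new_scores
        if delta < tol:
            break
    return scores


# Toy example: three images, two modalities (visual and textual similarity).
visual = [[0.0, 0.9, 0.1], [0.9, 0.0, 0.1], [0.1, 0.1, 0.0]]
textual = [[0.0, 0.8, 0.2], [0.8, 0.0, 0.1], [0.2, 0.1, 0.0]]
scores = random_walk_rank([visual, textual], weights=[0.5, 0.5])
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

In this toy graph, images 0 and 1 are strongly connected in both modalities, so the walk visits them most often and image 2 is ranked last. Changing `weights` shifts how much each modality influences the final order.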
[Degree-granting institution]: Shandong University
[Degree level]: Master's
[Year degree conferred]: 2012
[Classification number]: TP391.41
