Research on Cross-Modal Hashing Retrieval Based on Semantic Consistency and Matrix Factorization
Published: 2018-07-24 09:50
[Abstract]: Multi-modality is an important characteristic of big data. With the arrival of the big data era, retrieval across modalities, such as using an image to retrieve text, has become a growing demand. Cross-modal hashing methods use hash functions to map query data into binary codes (hash codes) in Hamming space, giving data from different modalities a unified form; retrieval across modalities is thereby reduced to retrieval among hash codes, which lowers storage cost and speeds up search. Moreover, hash codes usually preserve the similarity of the corresponding data, including both intra-modal and inter-modal similarity. Similarity preservation is the starting point of this thesis and a key component of cross-modal hashing methods. However, most existing cross-modal hashing methods measure the similarity between data only from low-level features and ignore semantics, which makes it harder to narrow the semantic gap and to improve retrieval accuracy. Humans distinguish and judge things at the semantic level, so the true relationship between data depends on semantics. When low-level features are noisy or weakly discriminative, using semantic similarity helps to generate more discriminative hash codes and thus improves retrieval accuracy. This thesis measures intra-modal and inter-modal similarity at the semantic level and proposes two cross-modal hashing methods: Semantic Consistency Cross-Modal Hashing, and Cross-Modal Hashing Based on Semantic Consistency and Matrix Factorization. Experiments on two existing mainstream datasets verify the effectiveness of the methods.
The main research contents and contributions of this thesis are:
(1) Semantic Consistency Cross-Modal Hashing measures the similarity between data using semantics only, which reduces the computational cost and narrows the semantic gap from hash codes to high-level semantics, ensuring that the similarity between hash codes is semantically consistent with the similarity between the original data. The hash functions transform data into hash codes through a linear mapping followed by binarization.
(2) Cross-Modal Hashing Based on Semantic Consistency and Matrix Factorization measures the intra-modal similarity of each modality using both semantics and low-level features and represents this similarity with a graph, narrowing the semantic gaps from low-level features to high-level semantics and from hash codes to high-level semantics. Matrix factorization is used to construct an abstract space shared by all modalities, yielding an abstract representation of the data; the corresponding hash codes are generated by quantizing this abstract representation, and learning the hash functions is finally cast as learning hyperplanes in binary classification.
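For contribution (1), the abstract only states that the hash functions transform data into hash codes through a linear mapping followed by binarization. A minimal sketch of that idea, assuming a ridge-regularized least-squares fit between features and target codes (the thesis's actual objective and optimization may differ; all function and variable names are illustrative):

```python
import numpy as np

def fit_linear_hash(X, B, lam=1e-3):
    """Fit a projection W so that sign(X @ W) approximates target codes B.

    X : (n, d) feature matrix of one modality
    B : (n, k) target hash codes in {-1, +1}
    Returns W : (d, k). A ridge-regularized least-squares fit is only one
    plausible way to realize "linear mapping + binarization".
    """
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ B)

def encode(X, W):
    """Hash function: linear mapping followed by sign binarization."""
    return np.where(X @ W >= 0, 1, -1)

# Toy usage with random data, just to show the shapes involved.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 32))                        # 100 samples, 32-dim features
B = np.where(rng.standard_normal((100, 8)) >= 0, 1, -1)   # 8-bit target codes
W = fit_linear_hash(X, B)
codes = encode(X, W)                                      # (100, 8) matrix of {-1, +1}
```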
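For contribution (2), the abstract describes a matrix factorization that builds an abstract space shared by the modalities, with hash codes obtained by quantizing that representation and hash functions learned as binary-classification hyperplanes. A rough sketch under those assumptions, using a plain alternating least-squares factorization; the thesis's actual objective, which also incorporates the semantic similarity graph, is not reproduced here, and all names are illustrative:

```python
import numpy as np

def shared_factorization(X1, X2, k, n_iter=50, lam=1e-3):
    """Factorize X1 ~ V @ U1.T and X2 ~ V @ U2.T with a shared latent matrix V.

    X1 : (n, d1), X2 : (n, d2) paired features from two modalities
    V  : (n, k) shared abstract representation of the n paired samples
    """
    n = X1.shape[0]
    rng = np.random.default_rng(0)
    V = rng.standard_normal((n, k))
    I = np.eye(k)
    for _ in range(n_iter):
        # Per-modality bases with V fixed (ridge-regularized least squares).
        U1 = np.linalg.solve(V.T @ V + lam * I, V.T @ X1).T
        U2 = np.linalg.solve(V.T @ V + lam * I, V.T @ X2).T
        # Shared representation with U1, U2 fixed.
        A = U1.T @ U1 + U2.T @ U2 + lam * I
        V = np.linalg.solve(A, (X1 @ U1 + X2 @ U2).T).T
    return V, U1, U2

def quantize(V):
    """Quantize the shared abstract representation into {-1, +1} hash codes."""
    return np.where(V >= 0, 1, -1)
```

Each bit of the resulting codes can then be treated as a binary label, so the out-of-sample hash function for a modality reduces to one separating hyperplane per bit, for example a linear classifier fit on the pairs (X1, B[:, j]).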
[Degree-granting institution]: Anhui University
[Degree level]: Master
[Year degree awarded]: 2017
[CLC classification number]: TP391.41
Article ID: 2140991
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/2140991.html