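Research and Implementation of an Image Search Engine Based on Relevance Feedback
Published: 2018-04-13 10:45
Topics: image retrieval + relevance feedback; Source: Nanjing University of Posts and Telecommunications, 2016 master's thesis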
[Abstract]: Rapidly evolving computer technology has brought people new ways of living and working. As network bandwidth has grown, users can reach websites faster, but the sheer volume of data on the Internet makes finding a specific piece of information like looking for a needle in a haystack. Search engines emerged to meet this need and were developed and refined within a short time. As multimedia content on the Internet has grown richer, users have come to expect more from search. To meet the demand for image search, a range of Internet image search techniques has flourished, from the relatively mature text-based image search to the steadily improving content-based image search. When searching for images, users usually care most about whether the results match their expectations. This thesis therefore combines the technical characteristics of text-based and content-based search engines, and proposes and implements a search engine based on relevance feedback. It first introduces the background and research significance of image search systems, then briefly describes the key technologies such a system requires, including link crawling, content extraction, image crawling, and index building, which together provide the theoretical and technical groundwork for developing a complete search system. Chapter 3 describes in detail the architecture of the relevance-feedback search engine, which comprises a user interface module, an image processing module, and a data crawling module, with a user relevance feedback mechanism added to these modules. Chapter 4 implements the corresponding functions on the basis of the framework and workflow proposed in Chapter 3. In the data crawling module, the crawler uses the HtmlUnit plug-in, which solves the problem that an ordinary spider can only crawl static pages and cannot parse dynamic ones. In the image processing module, feature extraction focuses on low-level image features: a perceptual hashing algorithm represents the shape and texture features of an image in numerical form. Finally, tests check the functions of all modules and verify that the proposed search engine achieves higher precision than the other search engines it was compared with. This is an applied research thesis whose purpose is to study and implement an image search system based on relevance feedback. It takes a path different from the traditional search system model, improving precision and recall as well as the user's experience during search.
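The abstract says a perceptual hashing algorithm turns image features into numbers but does not state which variant the thesis uses. The sketch below is an illustration only, assuming the simplest member of that family, the average hash: each pixel of a small grayscale grid contributes one bit (1 if it is at least as bright as the mean), and two images are compared by the Hamming distance between their hashes. The function names and the tiny 2x2 grids are hypothetical; a real system would first decode the image and downscale it, typically to 8x8.

```python
from typing import Dict, List, Sequence


def average_hash(pixels: Sequence[Sequence[int]]) -> int:
    """Hash a small grayscale grid: one bit per pixel, set when pixel >= mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p >= mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def rank_by_similarity(query_hash: int, candidates: Dict[str, int]) -> List[str]:
    """Return candidate names ordered from most to least similar to the query."""
    return sorted(candidates, key=lambda name: hamming_distance(query_hash, candidates[name]))


# Hypothetical 2x2 grayscale grids (values 0-255) standing in for downscaled images.
img_a = [[10, 200], [220, 15]]
img_b = [[30, 220], [240, 35]]   # img_a uniformly brightened
img_c = [[200, 10], [15, 220]]   # inverted layout

hash_a = average_hash(img_a)
hash_b = average_hash(img_b)
hash_c = average_hash(img_c)
ranking = rank_by_similarity(hash_a, {"img_c": hash_c, "img_b": hash_b})
```

Because each bit depends only on whether a pixel is above or below the image's own mean, a uniform brightness change leaves the hash unchanged, which is why `img_b` ranks ahead of `img_c` for the query `img_a`.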
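[Degree-granting institution]: Nanjing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree awarded]: 2016
[Classification number]: TP391.41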
Article ID: 1744158
Article link: https://www.wllwen.com/kejilunwen/sousuoyinqinglunwen/1744158.html