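A Smartphone Book Perception System Based on Vocabulary Tree Retrieval
Published: 2018-04-22 21:13
Topic: digital image processing + vocabulary tree; Reference: Beijing University of Posts and Telecommunications, 2013 master's thesis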
[Abstract]: As smart mobile terminals are upgraded and mobile communication technology develops rapidly, smartphones that access the Internet over mobile networks give users a new Internet experience. As the volume of image and multimedia content grows, traditional text search engines can no longer fully meet users' retrieval needs. Image search engines emerged against this background, and book retrieval is their most widely used application. Phone-based book retrieval applications generally take a book's barcode or a photo of its cover as the query. Such applications can retrieve only one book at a time, and they do not integrate the book's related information effectively.
To address these shortcomings, and taking into account the characteristics of the bookshelf scenario, this thesis designs and implements a smartphone book perception system based on vocabulary tree retrieval. The user photographs books standing side by side on a shelf; the image is uploaded to a server, which performs the book retrieval, and a web crawler system supplies the phone user with more detailed information about each book.
To improve retrieval accuracy, the system must first locate the spine edge lines that separate adjacent books in the query image. Building on a detailed analysis of the image features of shelved books and on the strengths of several digital image processing techniques, it extracts the edge lines between adjacent books through edge extraction, angle (orientation) extraction, short-edge filtering, filtering, and line extraction, thereby segmenting adjacent books effectively. The efficiency and accuracy of the algorithm are verified by testing.
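The spine-separation step can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes an upright grayscale image stored as a list of rows, uses a Sobel-style horizontal gradient for edge extraction, and replaces the line-extraction stage with a simple per-column test that keeps only long vertical edge runs. The names `sobel_x`, `longest_run`, and `spine_boundaries`, and the parameters `edge_thresh`, `min_len_ratio`, and `min_gap`, are hypothetical.

```python
def sobel_x(img):
    """Absolute horizontal Sobel response; large values mark vertical edges."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            out[y][x] = abs(gx)
    return out


def longest_run(flags):
    """Length of the longest unbroken run of True values."""
    best = run = 0
    for flag in flags:
        run = run + 1 if flag else 0
        best = max(best, run)
    return best


def spine_boundaries(img, edge_thresh=100, min_len_ratio=0.6, min_gap=3):
    """Return x positions of likely boundaries between upright book spines.

    Assumption: spines are vertical, so a boundary shows up as a column
    whose edge pixels form one long run. Short runs (lettering, logos)
    are discarded, which mirrors the short-edge filtering step.
    """
    grad = sobel_x(img)
    h, w = len(img), len(img[0])
    min_len = int(min_len_ratio * h)

    candidates = []  # (column, total edge strength)
    for x in range(w):
        flags = [grad[y][x] >= edge_thresh for y in range(h)]
        if longest_run(flags) >= min_len:
            candidates.append((x, sum(grad[y][x] for y in range(h))))

    # Neighbouring columns respond to the same physical edge, so merge
    # columns closer than min_gap and keep the strongest one per group.
    boundaries, cluster = [], []
    for x, strength in candidates:
        if cluster and x - cluster[-1][0] > min_gap:
            boundaries.append(max(cluster, key=lambda c: c[1])[0])
            cluster = []
        cluster.append((x, strength))
    if cluster:
        boundaries.append(max(cluster, key=lambda c: c[1])[0])
    return boundaries
```

On a real photo the spines are rarely perfectly vertical, so a line detector that tolerates a range of angles (for example a Hough transform) would likely be more robust than the column test used here.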
Next, an image retrieval algorithm based on a vocabulary tree is implemented to recognize each book. Building on the traditional SIFT feature extraction algorithm and the bag-of-visual-features classification method, the algorithm generates visual words with hierarchical k-means clustering and then applies TF-IDF weighting, which effectively improves retrieval efficiency.
In addition, to integrate book information from different websites, this thesis designs and implements a focused web crawler for book information. Based on an analysis of the characteristics of information extraction and of the websites' source code, the crawler fetches the required book information from the relevant websites and stores it in a database, consolidating the information users care about most. The result is a phone-based book perception system that combines image retrieval with Web information retrieval.
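The vocabulary-tree retrieval described in the abstract can be sketched as follows. This is a toy illustration under stated assumptions, not the thesis's code: descriptors are short tuples rather than 128-dimensional SIFT vectors, k-means uses a deterministic farthest-point initialization, only leaf nodes act as visual words, and images are compared by the cosine similarity of their TF-IDF vectors. All names (`VocabTree`, `build_index`, `rank`, and the helpers) are hypothetical.

```python
import math
from collections import Counter


def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """Group points by their nearest center."""
    groups = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: dist2(p, centers[i]))
        groups[nearest].append(p)
    return groups


def kmeans(points, k, iters=10):
    # Farthest-point initialization (an assumption, chosen for determinism).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    for _ in range(iters):
        groups = assign(points, centers)
        centers = [tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers, assign(points, centers)


class VocabTree:
    """Hierarchical k-means tree; each leaf is one visual word (an int id)."""

    def __init__(self, descriptors, branch=3, depth=2):
        self.branch = branch
        self.n_words = 0
        self.root = self._build(list(descriptors), depth)

    def _build(self, points, depth):
        if depth == 0 or len(points) < self.branch:
            self.n_words += 1
            return self.n_words - 1
        centers, groups = kmeans(points, self.branch)
        return (centers, [self._build(g, depth - 1) for g in groups])

    def word(self, desc):
        """Descend from the root, following the nearest center at each level."""
        node = self.root
        while not isinstance(node, int):
            centers, children = node
            node = children[min(range(len(centers)),
                                key=lambda i: dist2(desc, centers[i]))]
        return node


def tfidf(counts, idf):
    """L2-normalized TF-IDF vector from visual-word counts."""
    total = sum(counts.values())
    vec = {w: (c / total) * idf.get(w, 0.0) for w, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {w: v / norm for w, v in vec.items()}


def build_index(tree, images):
    """images maps a name to its descriptors; returns (vectors, idf)."""
    counts = {n: Counter(tree.word(d) for d in ds) for n, ds in images.items()}
    df = Counter(w for c in counts.values() for w in c)
    idf = {w: math.log(len(images) / df[w]) for w in df}
    return {n: tfidf(c, idf) for n, c in counts.items()}, idf


def rank(tree, index, idf, descriptors):
    """Return (name, score) pairs, best match first."""
    q = tfidf(Counter(tree.word(d) for d in descriptors), idf)
    scores = {n: sum(q.get(w, 0.0) * v for w, v in vec.items())
              for n, vec in index.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

A database would typically be built once with `VocabTree(all_descriptors)` followed by `build_index(tree, images)`, after which each segmented spine is looked up with `rank(tree, index, idf, spine_descriptors)`. Quantizing a descriptor costs only `branch` distance computations per level, which is where the efficiency gain over a flat vocabulary comes from.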
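[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree awarded]: 2013
[Classification number]: TP391.41;TP391.3;TN929.5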