Implementation of a Personalized Search and Cloud Service System for Cross-Media Tourism Big Data
Published: 2019-02-24 14:22
[Abstract]: With the rapid development of social networks, massive amounts of tourism data have accumulated on the Internet, causing information overload. Users must spend considerable effort to extract useful information, which drives a growing demand for efficient tourism information search. Research on personalized search and cloud service systems for cross-media tourism big data therefore has both theoretical and practical significance. The main contributions of this thesis are as follows.
(1) Based on the characteristics of user-shared photos in the tourism domain, a hypergraph-based random walk method for indexing tourism images is proposed. The method uses a hypergraph to model the relationships between tourism images and their associated metadata (such as capture time and user tags), fuses the different image features at the indexing stage, and uses the conventional visual vocabulary model at query time. It thus exploits multiple features of tourism images while avoiding the computation time and storage cost of fusing features at the query and ranking stages, yielding a more comprehensive and efficient image indexing method.
(2) A personalized tourism information search method based on hypergraph random walk is proposed. It combines the low-level visual features of the images themselves with associated information such as tags and geographic location, uses a hypergraph to model the relationships among these features, and searches and ranks results by random walk on the hypergraph model. The method accepts multiple types of cross-media input as query examples, such as text tags and images, and returns personalized results based on the personalization information the user supplies. Experiments on an Internet dataset show that the method improves search result quality compared with a general-purpose image search method that uses a single feature.
(3) To address the large volume and real-time updating of data in tourism image big data search, a cloud-based distributed visual vocabulary tree training method and an image search cloud service method based on the distributed visual vocabulary tree are proposed. The training method builds on a distributed K-means algorithm under the MapReduce model to train on images and perform retrieval in parallel, and it supports training on large numbers of images in memory. Experimental results show that per-node training time and memory consumption tend to decrease linearly as the number of computing units increases, which speeds up the construction and search of the cross-media index.
(4) A personalized search cloud service system for cross-media tourism big data is designed and developed. The system consists of a multi-feature indexing module, a personalized search module, and a search cloud service module, and provides users with a reliable personalized tourism data search cloud service.
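The abstract describes the hypergraph random walk only at a high level, so the following is a minimal sketch of one common formulation, not the thesis's actual implementation. It assumes images are vertices, each shared feature (a tag, a geographic cell, a visual word) is a hyperedge, and ranking is done by a random walk with restart. The function names, the restart probability of 0.15, and the toy data are all hypothetical.

```python
def hypergraph_transition(hyperedges, weights, n_vertices):
    """Row-stochastic transition matrix of a random walk on a hypergraph.

    A step from vertex u picks a hyperedge containing u with probability
    proportional to its weight, then picks a vertex of that hyperedge
    uniformly. Assumes each hyperedge is a set of vertex ids and every
    vertex belongs to at least one hyperedge.
    """
    degree = [0.0] * n_vertices  # weighted vertex degree
    for edge, w in zip(hyperedges, weights):
        for v in edge:
            degree[v] += w
    P = [[0.0] * n_vertices for _ in range(n_vertices)]
    for edge, w in zip(hyperedges, weights):
        for u in edge:
            for v in edge:
                P[u][v] += w / (degree[u] * len(edge))
    return P


def random_walk_with_restart(P, query, restart=0.15, tol=1e-10, max_iter=1000):
    """Iterate r = restart * q + (1 - restart) * P^T r until it stabilizes."""
    n = len(P)
    total = sum(query)
    q = [x / total for x in query]  # normalized restart distribution
    r = q[:]
    for _ in range(max_iter):
        nxt = [
            restart * q[v] + (1 - restart) * sum(r[u] * P[u][v] for u in range(n))
            for v in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, r)) < tol:
            return nxt
        r = nxt
    return r


# Toy data (hypothetical): four images linked by three shared features.
hyperedges = [
    {0, 1},  # images sharing a user tag
    {1, 2},  # images taken in the same geographic cell
    {2, 3},  # images sharing a visual word
]
weights = [1.0, 1.0, 1.0]

P = hypergraph_transition(hyperedges, weights, n_vertices=4)

# Query by example: the walk restarts at image 0.
scores = random_walk_with_restart(P, query=[1.0, 0.0, 0.0, 0.0])
ranking = sorted(range(len(scores)), key=lambda v: -scores[v])
```

In this sketch a text-tag query could be expressed by spreading the restart vector over the images in that tag's hyperedge, and personalization could be modeled by raising the weights of hyperedges that match a user's interests. Both are illustrative design choices rather than details taken from the thesis.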
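[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.3; TP393.09
Article ID: 2429626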
Article link: https://www.wllwen.com/guanlilunwen/ydhl/2429626.html