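Research on Query Processing Algorithms for Compressed XML Data
Published: 2018-06-18 07:27
Topics: XML encoding + XML compression; Source: Fujian Normal University, master's thesis, 2013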
[Abstract]: Since its introduction, XML has been widely used for data representation and data exchange on the Web. As a result, a large amount of XML data has accumulated on the Web, and managing it efficiently has become an active research topic. Compressed storage and querying of XML documents have been studied extensively, but the continued growth of XML data places higher demands on both. This thesis therefore explores the compressed storage of XML data and the direct querying of compressed data.
First, existing compression encoding schemes based on interval encoding and on prefix encoding are studied in depth, and representative schemes are described and analyzed in detail. A compression encoding scheme based on interval encoding is then proposed, together with its storage structure and encoding algorithm. The scheme saves storage space by compressing the XML data, and it also allows the positional relationship between nodes to be determined quickly, which improves query processing efficiency.
Next, based on the XSMR encoding scheme, four query processing algorithms in two categories are proposed, with the core algorithms and examples given in detail. The two categories are path query algorithms and complex query algorithms. Path query processing covers simple path queries and single-source path queries; other complex path queries can be decomposed into these two kinds. The complex query algorithms cover axis queries and value-containing queries. All four algorithms exploit the properties of the compression encoding scheme to improve query efficiency over compressed XML data.
Finally, the design and implementation framework of a prototype system is presented, along with comparative experiments. The experiments evaluate the proposed compressed storage scheme and query processing algorithms in terms of compression performance and query performance, and the performance of querying compressed data is analyzed. The results indicate that the proposed storage scheme and query algorithms perform well in these experiments and support most commonly used query operations.
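The claim that interval encoding lets node relationships be decided quickly can be illustrated with a minimal sketch. The abstract does not specify the XSMR scheme, so the code below assumes the classical (start, end, level) region labeling instead; the names `Label`, `label_tree`, `is_ancestor`, `is_parent`, and `descendants_by_tag` are hypothetical and are not taken from the thesis.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    """Classical region label (an assumption, not the thesis's XSMR format)."""
    tag: str
    start: int
    end: int
    level: int


def label_tree(root):
    """Assign (start, end, level) labels with one depth-first traversal.

    Labels are returned in document order.
    """
    labels = []
    counter = 0

    def visit(elem, level):
        nonlocal counter
        counter += 1
        start = counter
        index = len(labels)
        labels.append(None)  # reserve the slot so document order is kept
        for child in elem:
            visit(child, level + 1)
        counter += 1
        labels[index] = Label(elem.tag, start, counter, level)

    visit(root, 0)
    return labels


def is_ancestor(a, d):
    """a is an ancestor of d iff a's interval strictly contains d's."""
    return a.start < d.start and d.end < a.end


def is_parent(a, d):
    """Parent test: interval containment plus a level difference of one."""
    return is_ancestor(a, d) and d.level == a.level + 1


def descendants_by_tag(labels, anc_tag, desc_tag):
    """Evaluate a query shaped like //anc_tag//desc_tag using labels only."""
    ancestors = [l for l in labels if l.tag == anc_tag]
    return [
        d for d in labels
        if d.tag == desc_tag and any(is_ancestor(a, d) for a in ancestors)
    ]


doc = ET.fromstring(
    "<lib><book><title/><author/></book><paper><title/></paper></lib>"
)
labels = label_tree(doc)
titles_in_books = descendants_by_tag(labels, "book", "title")
```

Each relationship test is a constant number of integer comparisons and never revisits the document tree, which is the property that makes interval labels attractive for querying stored data directly. The nested loop in `descendants_by_tag` is kept for brevity; a real system would more likely use a sorted merge over the label lists.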
[Degree-granting institution]: Fujian Normal University
[Degree level]: Master's
[Year degree awarded]: 2013
[Classification number]: TP333