
RDF Data Storage and Management Based on Compressed Bitmap Indexes

Published: 2018-06-17 23:27

  Topics: RDF + data storage; Source: Beijing Jiaotong University, 2017 master's thesis


[Abstract]: With the wide adoption of the Resource Description Framework (RDF) across many fields, storing and managing massive RDF datasets has become an active research topic in recent years. Most existing RDF data management systems store data in traditional relational databases, an approach that struggles to manage data at this scale efficiently. Designing a high-performance RDF storage and management system that can be extended to a distributed deployment is therefore of practical importance. This thesis designs an RDF data storage scheme based on bitmap indexes, implements an RDF management system on top of that scheme, and verifies the scheme's feasibility and effectiveness through system testing. The main work is as follows.
(1) A survey of existing RDF data storage schemes. The thesis analyzes and summarizes the strengths and weaknesses of current mainstream data storage technologies and RDF data storage models.
(2) A highly scalable low-level storage scheme based on bitmap indexes. In the persistence layer, RDF data files are split into blocks and stored sequentially, which makes the system scalable. In addition, a query index based on compressed bitmaps is built over RDF keywords, which reduces memory consumption at run time.
(3) A data query algorithm for this scheme. The algorithm takes full advantage of the speed of logical operations on bitmap indexes to keep queries efficient.
(4) fishdb, an RDF data storage and query system that implements the scheme. Its performance was tested with a test dataset in a single-machine pseudo-distributed environment. Compared with the open-source RDF management system Google Cayley, fishdb obtains a considerable improvement in query performance at the cost of a small amount of additional memory, which supports the feasibility and effectiveness of the scheme.
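To make the idea in items (2) and (3) concrete, here is a minimal Python sketch of a bitmap index over RDF triples. It is not fishdb's code: the class name `BitmapTripleIndex`, its methods, and the sample triples are all hypothetical, and it uses plain Python integers as uncompressed bitmaps, whereas the thesis builds compressed bitmaps (the abstract does not name the compression scheme). Each term gets one bitmap per triple position, and a triple pattern is answered by AND-ing the bitmaps of its bound terms.

```python
from collections import defaultdict


class BitmapTripleIndex:
    """Toy bitmap index over an append-only list of RDF triples (illustrative only)."""

    def __init__(self):
        # Triples are stored sequentially; a triple's row id is its list position.
        self.triples = []
        # One bitmap per (position, term): bit i is set when triple i
        # has that term as subject (0), predicate (1), or object (2).
        self.bitmaps = defaultdict(int)

    def add(self, s, p, o):
        row = len(self.triples)
        self.triples.append((s, p, o))
        for pos, term in enumerate((s, p, o)):
            self.bitmaps[(pos, term)] |= 1 << row

    def match(self, s=None, p=None, o=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        # Start with every row selected, then narrow with one AND per bound term.
        result = (1 << len(self.triples)) - 1
        for pos, term in enumerate((s, p, o)):
            if term is not None:
                result &= self.bitmaps.get((pos, term), 0)
        # Decode the set bits back into triples, in insertion order.
        rows = []
        row = 0
        while result:
            if result & 1:
                rows.append(self.triples[row])
            result >>= 1
            row += 1
        return rows


# Hypothetical sample data, not taken from the thesis.
index = BitmapTripleIndex()
index.add("alice", "knows", "bob")
index.add("alice", "likes", "tea")
index.add("bob", "knows", "carol")

knows = index.match(p="knows")
alice_knows = index.match(s="alice", p="knows")
nobody = index.match(o="nobody")
```

A pattern with two bound terms costs a single bitwise AND over the two bitmaps, which is the kind of logical operation the abstract credits for query speed. A production index would replace the raw integers with a compressed representation so that sparse bitmaps stay small in memory.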
[Degree-granting institution]: Beijing Jiaotong University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP333;TP315





Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/2032927.html

