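Research and Implementation of RDF-Based Storage and Retrieval Methods for Cloud Manufacturing Resource Data
Published: 2018-01-04 05:11
Keywords: research and implementation of RDF-based storage and retrieval methods for cloud manufacturing resource data. Source: Beijing Jiaotong University, master's thesis, 2013. Document type: degree thesis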
[Abstract]: As Semantic Web technology has developed and matured, the Resource Description Framework (RDF) has been applied in a growing number of fields. With the worldwide shift to informatization and the explosive growth of data, however, storing and retrieving large-scale RDF data has become a key technology for industry data integration and data analysis. Improving the scalability of RDF data storage and the efficiency of data retrieval is therefore of practical significance for current web service management, data management, cloud computing, and industry data sharing and integration.
First, this thesis compares storage methods for RDF data. Because traditional relational database technology struggles to cope with massive data storage, it proposes an RDF storage scheme based on HBase. The logical storage structure of the tables in this scheme stores data in dynamic columns, which makes it more efficient at handling the multi-valued properties that can occur in RDF.
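To make the dynamic-column idea concrete, here is a minimal in-memory sketch in Python. The abstract does not give the actual table schema, so everything here is an assumption for illustration: the class `WideRowTripleStore`, the `predicate|object` qualifier encoding, and the `ex:` sample identifiers are hypothetical, and a plain dictionary stands in for an HBase table.

```python
from collections import defaultdict


class WideRowTripleStore:
    """In-memory stand-in for one wide-row table: row key -> {column qualifier -> value}."""

    def __init__(self):
        self.rows = defaultdict(dict)

    def put_triple(self, subject, predicate, obj):
        # Assumed layout: the row key is the subject and the object is folded
        # into the column qualifier, so a multi-valued predicate simply adds
        # more columns to the same row. A real encoding would need to escape
        # the "|" separator.
        qualifier = f"{predicate}|{obj}"
        self.rows[subject][qualifier] = b""

    def objects(self, subject, predicate):
        # Collect every column of the row whose qualifier starts with the predicate.
        prefix = predicate + "|"
        row = self.rows.get(subject, {})
        return sorted(q[len(prefix):] for q in row if q.startswith(prefix))


store = WideRowTripleStore()
store.put_triple("ex:Lathe01", "ex:canProcess", "ex:Steel")
store.put_triple("ex:Lathe01", "ex:canProcess", "ex:Aluminium")
store.put_triple("ex:Lathe01", "ex:locatedIn", "ex:Beijing")
print(store.objects("ex:Lathe01", "ex:canProcess"))  # ['ex:Aluminium', 'ex:Steel']
```

In this sketch a second value for the same predicate needs no schema change and no join table, which is one plausible reading of why dynamic columns suit multi-valued RDF data.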
Next, because traditional keyword-based queries cannot return comprehensive and accurate information, the thesis studies and designs a query interface between SPARQL, the RDF query language, and HBase. It proposes a semantically extended query method and its core algorithm, and implements queries that combine semantic-based and keyword-based search.
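The following Python sketch illustrates one common form of semantic query expansion combined with keyword matching. It is not the algorithm from the thesis, which the abstract does not describe; the functions `expand_term` and `search`, the subclass hierarchy, and the sample resources are all hypothetical. The assumption is that expansion follows `rdfs:subClassOf`-style links downward from the query term.

```python
def expand_term(term, subclass_of):
    """Return the term plus every transitive subclass of it."""
    children = {}
    for child, parent in subclass_of:
        children.setdefault(parent, set()).add(child)
    expanded, stack = {term}, [term]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in expanded:
                expanded.add(child)
                stack.append(child)
    return expanded


def search(keyword, resources, subclass_of):
    """Match a resource if its label contains the keyword (keyword search)
    or its type falls under the class named by the keyword (semantic search)."""
    wanted = expand_term(keyword, subclass_of)
    return sorted(
        name
        for name, (rtype, label) in resources.items()
        if keyword.lower() in label.lower() or rtype in wanted
    )


# Hypothetical class hierarchy as (subclass, superclass) pairs.
SUBCLASS_OF = [("Lathe", "MachineTool"), ("CNCLathe", "Lathe"), ("Mill", "MachineTool")]
# Hypothetical resources: name -> (type, label).
RESOURCES = {
    "r1": ("CNCLathe", "CK6140 turning centre"),
    "r2": ("Mill", "vertical mill"),
    "r3": ("Robot", "welding robot"),
    "r4": ("Robot", "lathe loading robot"),
}
print(search("Lathe", RESOURCES, SUBCLASS_OF))  # ['r1', 'r4']
```

A keyword-only search for "Lathe" would return just `r4`; the expansion also finds `r1`, whose label never mentions the word but whose type is a subclass of `Lathe`.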
Third, because traditional centralized data processing struggles to deliver fast information retrieval, the thesis adds an indexing mechanism and a parallel query mechanism on top of the query logic to optimize query efficiency. The index reduces the number of nodes that must be traversed during a query, and parallelization lets a single query execute concurrently across the nodes, which improves query efficiency.
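The two optimizations can be sketched together in Python as follows. This is an illustrative assumption, not the thesis design: the `Shard` class stands in for the data held by one storage node, its `(predicate, object)` index is a hypothetical index layout, and a thread pool stands in for distributed execution, so the sketch shows the structure of the approach rather than real speedups.

```python
from concurrent.futures import ThreadPoolExecutor


class Shard:
    """Stand-in for the triples held by one storage node."""

    def __init__(self, triples):
        self.triples = list(triples)
        # Assumed index: (predicate, object) -> list of matching subjects.
        self.index = {}
        for s, p, o in self.triples:
            self.index.setdefault((p, o), []).append(s)

    def scan(self, predicate, obj):
        # Baseline without an index: visit every triple on the shard.
        return [s for s, p, o in self.triples if p == predicate and o == obj]

    def lookup(self, predicate, obj):
        # With the index: a single dictionary probe.
        return self.index.get((predicate, obj), [])


def parallel_query(shards, predicate, obj):
    """Send the same lookup to every shard concurrently and merge the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        parts = pool.map(lambda shard: shard.lookup(predicate, obj), shards)
        return sorted(s for part in parts for s in part)


SHARDS = [
    Shard([("m1", "type", "Lathe"), ("m2", "type", "Mill")]),
    Shard([("m3", "type", "Lathe")]),
]
print(parallel_query(SHARDS, "type", "Lathe"))  # ['m1', 'm3']
```

The indexed parallel query returns the same subjects as scanning every shard in turn; the difference is that each shard answers with one lookup instead of a full traversal, and the shards work at the same time.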
Finally, comparative experiments are used to compare a query scheme without the index and parallel mechanisms against a query scheme with them. Analysis of the experimental results shows that, for large data volumes, the scheme with the index and parallel mechanisms outperforms the scheme without them.
[Degree-granting institution]: Beijing Jiaotong University
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP333
Link to this article: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/1377183.html