Research and Implementation of Key Technologies for Data Integration of Heterogeneous Knowledge Warehouses
Published: 2018-10-20 14:47
[Abstract]: University institutional repositories are a mainstream application of knowledge warehouses. Since their emergence they have played an important role in scholarly communication, electronic publishing, long-term preservation, knowledge management, education, research evaluation, and resource sharing. However, the institutional repositories of Chinese universities operate independently of one another, so academic resources cannot be shared, which hinders their use and dissemination. As demand for networked digital literature resources grows, these isolated islands of resources can no longer meet users' needs; jointly building and sharing resources and offering users a one-stop service has become inevitable. The aim of this work is to interconnect institutional repositories and thereby give users a unified access service over the integrated resources. The work builds a centralized data warehouse by data replication, harvesting the data of each institutional repository into a data integration center. It uses the OAI-PMH protocol together with the METS standard: object data is packaged into a METS document and embedded in an OAI record, so that the metadata and the object data of academic resources are harvested together. To make object-data harvesting more flexible, two metadata types, etd and etds, are provided. Both define the metadata format used in OAI records; the difference is that the etds type harvests metadata only. In the harvesting system, each institution acts as a data provider and publishes OAI records through OAICat, while the data integration center acts as the harvester: it calls each repository's OAI interface to harvest OAI records, parses the item metadata and object data out of them, and stores both in the center's data warehouse. After harvesting, the data integration center performs semantic analysis on the academic resources of a given subject category in the warehouse to obtain a domain keyword lexicon, and on that basis establishes semantic associations between academic resources. While browsing, users can click an associated keyword to discover related resources quickly, which provides a personalized knowledge service.
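The harvester side of the flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the base URL, the sample record, and the function names `build_list_records_url` and `parse_list_records` are hypothetical, and it assumes that object data is referenced from the METS document through `mets:FLocat` links (the abstract does not say whether files are linked or embedded inline). The only parts taken from the source are the use of OAI-PMH `ListRecords` requests and the `etd` / `etds` metadata prefixes.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Standard namespace URIs for OAI-PMH 2.0, METS, and XLink.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "mets": "http://www.loc.gov/METS/",
}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def build_list_records_url(base_url, metadata_prefix="etd", resumption_token=None):
    """Build a ListRecords request URL.

    In OAI-PMH, resumptionToken is an exclusive argument: when it is
    present, metadataPrefix must not be sent again.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (records, next_token) parsed from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        # Assumption: each object file is referenced by a mets:FLocat link.
        files = [
            loc.get(XLINK_HREF)
            for loc in rec.iterfind(".//mets:FLocat", NS)
            if loc.get(XLINK_HREF)
        ]
        records.append({"identifier": identifier, "files": files})
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=NS)
    return records, (token.strip() or None)


# Hypothetical response, shaped like an "etd" record with one attached file.
SAMPLE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.edu:123</identifier>
        <datestamp>2016-01-01</datestamp>
      </header>
      <metadata>
        <mets:mets xmlns:mets="http://www.loc.gov/METS/"
                   xmlns:xlink="http://www.w3.org/1999/xlink">
          <mets:fileSec>
            <mets:fileGrp>
              <mets:file ID="f1" MIMETYPE="application/pdf">
                <mets:FLocat LOCTYPE="URL"
                    xlink:href="https://repo.example.edu/files/123.pdf"/>
              </mets:file>
            </mets:fileGrp>
          </mets:fileSec>
        </mets:mets>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>
"""

if __name__ == "__main__":
    records, token = parse_list_records(SAMPLE)
    print(records)
    print(build_list_records_url("https://repo.example.edu/oai", resumption_token=token))
```

A harvester would request with `metadata_prefix="etds"` when only metadata is wanted and with `"etd"` when the attached objects should be fetched as well, then keep issuing requests with the returned resumption token until none is returned.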
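[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree granted]: 2016
[Classification number]: TP311.13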
Article ID: 2283478
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/2283478.html