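Establishment of a Yersinia pestis Knowledge Base
Published: 2018-06-12 14:25
Topics: Yersinia pestis + literature; Source: master's thesis, Academy of Military Medical Sciences of the Chinese People's Liberation Army, 2017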
[Abstract]:
1. Background and objectives
Yersinia pestis (hereafter Y. pestis) is the causative agent of plague and has caused three pandemics in recorded history. During World War II, the invading Japanese forces waged bacteriological warfare and used Y. pestis as a biological warfare agent against China. Today plague cases still occur from time to time around the world, and the disease remains a global public health problem. China still has 12 natural plague foci, distributed across 19 provinces (autonomous regions) and covering about 15% of the national territory. Strengthening research on Y. pestis is therefore of great practical importance for plague prevention and control and for responding to bioterrorism threats. As biological techniques advance, basic and applied research on Y. pestis keeps growing. This work has produced large amounts of experimental data and literature, scattered across information databases around the world. Searches of Chinese literature sites such as CNKI and VIP for keywords such as "鼠疫耶尔森氏菌数据库" (Yersinia pestis database) and "鼠疫菌知识库" (Y. pestis knowledge base) returned several thousand research papers on the bacterium, but no published work on integrating comprehensive knowledge about it. Literature backtracking in PubMed and searches with Google Scholar showed that several Y. pestis-related databases exist internationally. GenBank stores whole-genome sequences, annotations and raw sequencing data for Y. pestis; the MLVA database holds variable-number tandem repeat loci and strain background information; the CRISPR database provides sequence information on clustered regularly interspaced short palindromic repeat loci; and other large public databases such as BioGRID, DDBJ and EMBL store related information such as protein-protein and gene interactions. These databases mainly collect and manage molecular and genetic-diversity data. None of them organizes and integrates the literature knowledge on Y. pestis and stores it in an orderly way on a single dedicated database platform. This study set out to collect academic papers, monographs, news and epidemic information on the bacterium, integrate them, and build a convenient Y. pestis knowledge base system offering query, browsing and download services, with the following aims: (1) centralized storage and management of existing Y. pestis knowledge and data; (2) convenient querying, to make existing literature information more efficient to use; (3) automatic updating, to obtain Y. pestis research news in real time. The ultimate goal is to provide more complete and convenient knowledge and data support for Y. pestis research, and a reference example for building knowledge bases for other pathogens.

2. Methods
2.1 Data collection and integration
After surveying the features of reference managers including EndNote, Reference Manager, Biblioscape and NoteExpress, and given this project's need to collect, organize, screen and export both Chinese and foreign-language literature, EndNote X5 was chosen for literature collection and organization. The search keywords "Black death", "Yersinia", "Plague" and "Yersinia pestis" were set in EndNote, and PubMed was searched online to collect and screen the relevant foreign-language records. The Find Full Text function retrieved some full texts automatically and linked them to their records; the remaining foreign-language full texts were collected manually through the Sci-Hub website. Because EndNote has no Chinese search function, Chinese records were obtained mainly by searching CNKI for the keywords "黑死病" (Black Death), "鼠疫" (plague) and "鼠疫耶尔森菌" (Yersinia pestis); after manual screening, the relevant records were imported into EndNote. Chinese full texts were collected from CNKI and Wanfang Data. All PDF full texts were attached manually to their records in the EndNote library through Reference → File Attachment. After collection, the Find Duplicates function was used to check for and remove duplicate records, and the final set was exported in the Show All Fields format, with all literature information stored in a single standalone TXT file. The PDF full texts stayed in EndNote's own default folder, with paths unchanged.
2.2 Construction of the knowledge base system
Perl and PHP were used to normalize the data format, build the entity-relationship model, set up the system and develop the web pages. The Apache + PHP + MySQL combination, common in small and medium-sized website development, was chosen for dynamic site development, and the knowledge base system was built up step by step. To load each part of the data onto the server, a Perl script parsed the TXT file holding the literature information and generated a text list that can be opened in Excel; the list separates fields with tabs and assigns each record (one per row) a unique integer identifier. A second Perl script renamed the PDF files after these integer identifiers and moved them into a single dedicated PDF folder. After normalization, papers and books were decomposed into entities and assigned attributes. Based on the entity-to-entity and entity-to-attribute relationships, an entity-relationship model was built, converted into tables, and created in a configured MySQL 5.7 database. phpMyAdmin was used to import the literature data into MySQL and to establish a one-to-one correspondence between records and full-text PDF files. Once the database was complete, dynamic web pages with a web interface were developed on the Apache web server to form the database website. JavaScript and Ajax handle client-server interaction, and Perl processes the data submitted from the knowledge base pages to the back end. After the system was built, trial runs and debugging were carried out to make sure it runs smoothly.

3. Results
The Y. pestis knowledge base consists of a literature information module, a search module and a news module. Users can reach it from a web browser at http://101.201.51.148/ypkd/. The home page is laid out in two parts: the title bar, navigation bar, introduction, data overview and quick-search entry at the top, and the news display section below. As of the search cutoff on October 30, 2016, the literature information module held 7183 Y. pestis-related papers and 23 books, all with URL links; PDF full text had been collected for 4620 of the papers. In basic or advanced search, users can set fields such as title, abstract, keywords, journal, author and date to retrieve relevant records, and can sort the results by publication date, author or journal name. Brief information is listed beneath each title for a first look; clicking the title opens the detailed record, including title, authors, abstract, keywords, PMID, full text and PubMed link, and clicking the PDF link provided by the knowledge base opens the full text for online viewing or download. In the news section, a purpose-built web crawler automatically fetches the latest Y. pestis-related information from the Internet in real time and displays it in order; users can follow the hyperlinks to the full text of each item, which keeps the knowledge base current and complete.

4. Conclusions
The knowledge base system built in this study is rich and complete in content, has a simple and sensible interface, and is quick and convenient to use. It brings together Y. pestis-related information from PubMed, CNKI and other sites and includes almost all published literature resources on Y. pestis, which helps preserve and use this knowledge as a whole. By entering keywords in a basic or combined search, users get an accurate list of relevant literature on the page and can sort it as needed; the full texts provided save researchers the time of hunting for resources and surveying the literature across multiple sources. Through its web crawler, the site automatically updates the latest Y. pestis-related web information in real time and displays it in order in the news section, which keeps the site timely. Users can follow the hyperlinks provided to read the full text quickly and get the details. The knowledge base gives researchers a reliable platform for sharing knowledge on Y. pestis, provides a timely and dynamic source of plague epidemic information, and offers a new model that other pathogen knowledge bases can draw on.
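
The parsing step in section 2.2 (turning the exported TXT file into a tab-separated list with one unique integer identifier per record) can be illustrated with a short sketch. The thesis did this in Perl; the version below is a Python illustration and not the authors' script. It assumes, without confirmation from the source, that the export consists of `Field: value` lines with a blank line between records. The function names `parse_records` and `to_tsv` and the sample records are hypothetical.

```python
import csv
import io
import re


def parse_records(text):
    """Split an export into records; each record is a dict of field -> value.

    Assumption: records are blocks of "Field: value" lines separated by
    blank lines. Wrapped continuation lines are not handled in this sketch.
    """
    records = []
    blocks = re.split(r"\n\s*\n", text.replace("\r\n", "\n").strip())
    for block in blocks:
        record = {}
        for line in block.splitlines():
            field, sep, value = line.partition(":")
            key, value = field.strip(), value.strip()
            if not sep or not key:
                continue  # skip lines that are not "Field: value"
            # Repeated fields (e.g. several Author lines) are joined with "; ".
            record[key] = f"{record[key]}; {value}" if key in record else value
        if record:
            records.append(record)
    return records


def to_tsv(records, fields):
    """Write records as tab-separated text, one row per record.

    The first column is a unique integer identifier starting at 1.
    """
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["id", *fields])
    for ref_id, record in enumerate(records, start=1):
        writer.writerow([ref_id, *(record.get(f, "") for f in fields)])
    return out.getvalue()


# Hypothetical sample input, made up for illustration only.
sample = """Reference Type: Journal Article
Author: Author A
Author: Author B
Year: 2013
Title: Example title one

Reference Type: Book
Author: Author C
Year: 2011
Title: Example title two
"""

rows = parse_records(sample)
tsv = to_tsv(rows, ["Author", "Year", "Title"])
print(tsv)
```

The same integer identifier could then serve as the PDF file name (for example `1.pdf`) and as the key of the database table, which is one way to obtain the one-to-one correspondence between records and full texts that the abstract describes.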
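[Degree-granting institution]: Academy of Military Medical Sciences of the Chinese People's Liberation Army
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: R516.8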
Article ID: 2009981
Article link: https://www.wllwen.com/yixuelunwen/chuanranbingxuelunwen/2009981.html