
Construction of a Life Science Knowledge Network System and Network Information Analysis

Published: 2018-04-16 12:00

  Topics: bioinformatics databases + network analysis; Source: Zhejiang University, 2012 doctoral dissertation


[Abstract]: With the large-scale generation of high-throughput data, bioinformatics databases and systems biology are becoming increasingly important in life science research. At the same time, the sheer number of databases and web services puts users at risk of being overwhelmed by data, and how to organize and use this information effectively has become a focus of bioinformatics research. To build a unified bioinformatics framework that can integrate, organize, and analyze data and information of different sources and types, we carried out a basic analysis of the data structures and information composition of biological information. Building on the processing of raw data, this study designed a data framework in which concepts are nodes and relations are edges. We built a unified ontology library for a massive number of life science concepts and a new semantics-based literature search engine. We also developed a new set of network analysis algorithms; combined with our normalized information scores, they can quickly compute and rank the most relevant concepts and possible information paths, and ultimately suggest possible biological interpretations. On the basis of this groundwork and data processing, we developed a life science knowledge engine named BioPubInfo (http://www.biopubinfo.org), which comprises a literature search engine and a network knowledge analysis engine. At present, interface development and back-end setup for the network knowledge analysis engine are preliminarily complete, while the literature search engine is still being refined. In the course of analyzing and processing massive life science data, we designed and worked out a set of algorithms that analyze and process such data and use its network structure to search for and predict new knowledge. By taking full advantage of a graph database and a graph data structure framework, the new algorithms achieve real-time analysis of a concept-relation network on the scale of hundreds of millions of entries, and on this basis we predicted candidate genes for human diseases and for traits of Arabidopsis thaliana and rice. Based on the resulting concept network and its underlying idea, we predicted relationships between phenotypes and genes in rice and integrated other information to establish the QTXtoGene analysis platform; more species and traits will be added later. During global data integration, we also examined several further questions, including the salt-stress expression regulatory network of Arabidopsis thaliana, genome evolution, and horizontal gene transfer. We constructed expression regulatory networks of Arabidopsis thaliana roots at different time points under salt stress, and, using a new method for detecting horizontally transferred genes, identified 10 such genes in the silkworm (Bombyx mori) genome. In addition, a mutual information method was used to analyze the network of relationships among different sites of the influenza virus receptor protein.
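The abstract describes a framework with concepts as nodes, relations as edges, and normalized scores used to rank related concepts and paths. The thesis's actual algorithm is not given here, so the following is only a minimal sketch of one way such a ranking could work: it assumes each relation carries a score in (0, 1], defines the strength of a path as the product of its edge scores, and uses a Dijkstra-style best-first search to find the strongest path to every reachable concept. The function names, the toy concept labels, and the scores are all hypothetical.

```python
import heapq
from collections import defaultdict


def build_graph(relations):
    """Build an undirected adjacency map from (concept_a, concept_b, score) triples."""
    graph = defaultdict(dict)
    for a, b, score in relations:
        # Keep the strongest score if a pair is listed more than once.
        strongest = max(score, graph[a].get(b, 0.0))
        graph[a][b] = strongest
        graph[b][a] = strongest
    return graph


def rank_related(graph, source):
    """Rank concepts by the strongest path (product of edge scores) from source.

    Assumes every score lies in (0, 1], so extending a path never increases
    its strength and a best-first search finds the optimum.
    """
    best = {source: 1.0}
    prev = {}
    heap = [(-1.0, source)]  # max-heap via negated strength
    while heap:
        neg_strength, node = heapq.heappop(heap)
        strength = -neg_strength
        if strength < best[node]:
            continue  # stale heap entry; a stronger path was already found
        for neighbor, score in graph[node].items():
            candidate = strength * score
            if candidate > best.get(neighbor, 0.0):
                best[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (-candidate, neighbor))
    ranked = sorted(
        ((concept, s) for concept, s in best.items() if concept != source),
        key=lambda item: -item[1],
    )
    return ranked, prev


def path_to(prev, source, target):
    """Reconstruct the strongest path from source to target."""
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical toy network; labels and scores are invented for illustration.
relations = [
    ("salt stress", "gene_A", 0.9),
    ("gene_A", "root growth", 0.8),
    ("salt stress", "root growth", 0.5),
    ("gene_A", "gene_B", 0.4),
]
graph = build_graph(relations)
ranked, prev = rank_related(graph, "salt stress")
for concept, strength in ranked:
    print(f"{concept}: {strength:.2f} via {path_to(prev, 'salt stress', concept)}")
```

In this toy network the indirect path through `gene_A` to `root growth` (0.9 × 0.8) outranks the direct edge (0.5), which illustrates how a path-based score can surface an intermediate concept as a candidate explanation.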
[Degree-granting institution]: Zhejiang University
[Degree level]: Doctoral
[Year degree conferred]: 2012
[Classification number]: TP391.3;Q811.4


Article ID: 1758782


Article link: https://www.wllwen.com/kejilunwen/sousuoyinqinglunwen/1758782.html

