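A Cluster Model of Bacterial Essential Genes and Construction of a Minimal Gene Set
Published: 2018-07-11 16:39
Topics: experimentally determined essential genes + essential gene clusters; Source: doctoral dissertation, University of Electronic Science and Technology of China, 2015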
[Abstract]: Essential genes are the genes an organism cannot do without to sustain its basic life activities. In recent years, bacterial essential gene sets have become a research focus in microbiology, medicine, genomics, bioinformatics, and related disciplines. Because of their importance, essential genes form a foundation of synthetic biology, can serve as potential targets for antibacterial drug design, and help in understanding the earliest common ancestor of life. This dissertation takes essential genes as its subject, proposes a cluster model of essential genes, and builds the first database of essential gene clusters (Database of cluster of essential gene, CEG). Based on CEG, an essential gene prediction algorithm and its software implementation (CEG_Match) were developed, a blueprint of a bacterial minimal gene set was drawn, and a minimal metabolic network was reconstructed. Using the species in CEG as the reference set, gene fitness was calculated for 2186 bacteria, and the first bacterial gene fitness database (IFIM) was built. The details are as follows.

(1) We propose, for the first time, a cluster model for storing essential genes, rather than storing them as individual genes as existing essential gene databases do, and we build the first essential gene cluster database. The model (database) contains clusters of homologous essential genes. It covers 16 strains (15 species) whose essential genes were determined experimentally. Genes that have the same function across these species are grouped into one cluster, which yields 932 real essential gene clusters containing two or more essential genes and 1929 pseudo-clusters containing only one essential gene. Storing essential genes by cluster makes the data much easier to use: for example, from the size of each cluster, a user can readily tell whether an essential gene is conserved across multiple bacterial species or is species-specific. The model (database) also records, for the genes (proteins) of each essential gene cluster, whether they are conserved in humans. With cluster size, conservation in humans, and other such information, researchers can carry out studies on evolution and drug design.

(2) Based on the cluster model, we developed a K-value algorithm for essential gene prediction and implemented it as software (CEG_Match). The software relies on the functional homology of genes rather than on sequence homology. A gene therefore does not need to be sequenced; determining its function through a simple experiment is enough to predict whether it is essential. The software is easy to use. Compared with BLAST-based homology search, it has a lower false positive rate while keeping reasonable accuracy, and its running time is far shorter.

(3) Understanding an organism's survival fitness is important for a complete understanding of microbial genetics and for effective drug design. Existing essential gene databases provide only experimentally determined binary essentiality data. We integrated the experimental data for the bacteria in CEG, combined them with theoretical predictions, proposed using continuous values to represent gene essentiality, and built the first microbial gene fitness database. The database covers gene fitness for 11 bacteria in CEG determined by single-gene knockout and transposon mutagenesis experiments, experimental gene fitness for 1 yeast, and theoretically predicted gene fitness data for 2186 bacteria. The predicted gene fitness correlates significantly with the experimental gene fitness, which indicates that the predictions are as reliable as the experimental values. Users can conveniently access and browse the data in the database. As the first resource storing microbial gene fitness, the database helps researchers better understand microbial genetics and develop antibacterial drugs that reduce drug resistance in pathogens, particularly for species that lack experimentally determined gene fitness.

(4) Finally, based on CEG, a blueprint of a bacterial minimal gene set was drawn and a minimal metabolic network was reconstructed. A minimal gene set is very important for assembling a minimal artificial cell. Although minimal gene sets have been reported for some bacteria, the published sets either satisfy only the self-replication (reproduction) system or include metabolism-related genes only to a limited extent. To obtain a more reliable and complete bacterial minimal gene set, we depart from the traditional strategy in the following ways: taking CEG as the basis and starting from experimentally determined essential genes, we propose a half-retention method to determine conserved genes, and we introduce minimal metabolic network reconstruction to complete the minimal gene set. The result is a minimal gene set of 315 essential genes, of which 157 take part in the minimal metabolic network, involving 431 metabolic reactions. This is the first minimal gene set that satisfies both the self-replication (reproduction) and the self-maintenance (metabolism) systems. Through the minimal metabolic network reconstruction, we confirmed 20 previously identified key metabolites and newly identified 5 more. In addition, in the minimal metabolic network, highly essential genes tend to distribute the metabolites they involve across multiple reactions, which suggests that when one reaction is disrupted, the bacterium can keep more metabolites functioning normally and so lower the risk of lethality. Lastly, the dissertation discusses applications of the minimal gene set: it can be used to expand existing drug target databases for developing new drugs that reduce bacterial drug resistance, and a semi-de novo synthesis strategy is proposed to help design and synthesize a chassis cell with broad biological applications.

In summary, this dissertation gives a fairly comprehensive exploration of bacterial essential genes and minimal gene sets and applies the results to essential gene prediction, drug target gene discovery, and synthetic biology. The study has made some progress, but several problems remain for further research.
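The cluster bookkeeping described in point (1) and the half-retention idea in point (4) can be illustrated with a small Python sketch. This is not the dissertation's implementation: the abstract does not define the half-retention rule precisely, so the sketch assumes it means "keep a cluster if its genes occur in at least half of the reference strains". The strain names, gene names, function labels, and helper functions are all hypothetical toy values.

```python
from collections import defaultdict

# Hypothetical toy records: (strain, essential gene, function label).
# Real CEG data covers 16 strains; five made-up strains are used here.
ESSENTIAL_GENES = [
    ("strain_A", "dnaA_A", "chromosomal replication initiator"),
    ("strain_B", "dnaA_B", "chromosomal replication initiator"),
    ("strain_C", "dnaA_C", "chromosomal replication initiator"),
    ("strain_D", "dnaA_D", "chromosomal replication initiator"),
    ("strain_E", "dnaA_E", "chromosomal replication initiator"),
    ("strain_A", "ftsZ_A", "cell division protein"),
    ("strain_B", "ftsZ_B", "cell division protein"),
    ("strain_C", "xyz_C", "strain-specific function"),
]


def build_clusters(records):
    """Group essential genes that share a function label into one cluster."""
    clusters = defaultdict(list)
    for strain, gene, function in records:
        clusters[function].append((strain, gene))
    return dict(clusters)


def is_real_cluster(members):
    """A real cluster holds two or more essential genes; otherwise it is a pseudo-cluster."""
    return len(members) >= 2


def half_retained(members, n_strains):
    """Assumed rule: retain a cluster found in at least half of the reference strains."""
    strains = {strain for strain, _ in members}
    return 2 * len(strains) >= n_strains


clusters = build_clusters(ESSENTIAL_GENES)
n_strains = len({strain for strain, _, _ in ESSENTIAL_GENES})

real = sorted(f for f, m in clusters.items() if is_real_cluster(m))
pseudo = sorted(f for f, m in clusters.items() if not is_real_cluster(m))
conserved = sorted(f for f, m in clusters.items() if half_retained(m, n_strains))

print("real clusters:", real)
print("pseudo-clusters:", pseudo)
print("retained as conserved:", conserved)
```

In this toy data the cell division cluster is a real cluster (two members) but is not retained, because it appears in only two of five strains. That is the distinction the sketch is meant to show: cluster size separates real clusters from pseudo-clusters, while the assumed half-retention threshold decides which clusters count as conserved.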
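[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Doctoral
[Year degree granted]: 2015
[Classification number]: Q78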
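[Related literature]
Related journal articles
1 Ye Yuannong; Guo Fengbiao; Current status of theoretical research on microbial essential genes [J]; Hereditas (Beijing); 2012, No. 04
2 Shen Lulu; Du Min; Lin Xingfeng; Cai Ting; Wang Dayong; Genes required for the function of the olfactory neuron AWA regulate aging in Caenorhabditis elegans in an insulin signaling-dependent manner (in English) [J]; Neuroscience Bulletin; 2010, No. 02
Related conference papers
1 Zhang Chunting; Research on bacterial essential genes and the minimal genome [A]; Proceedings of the 5th National Conference on Bioinformatics and Systems Biology [C]; 2012
2 Guo Fengbiao; Ning Lüwen; Huang Jian; Lin Hao; Zhang Huixiong; Anomalous distribution of essential genes on the three chromosomes of Burkholderia cenocepacia strain AU-1054 [A]; Genetics Research in China: Advances in Genetics Promote Economic and Social Development in Western China: Abstracts of the 2011 Conference of the Genetics Society of China [C]; 2011
Related doctoral dissertations
1 Ye Yuannong; A cluster model of bacterial essential genes and construction of a minimal gene set [D]; University of Electronic Science and Technology of China; 2015
2 Lin Yan; Analysis of microbial essential gene data [D]; Tianjin University; 2010
Related master's theses
1 Lin Dan; Prediction and analysis of functional genes in multiple microorganisms [D]; University of Electronic Science and Technology of China; 2014
2 Dou Yuntao; Development of the prokaryotic gene-finding program ZCURVE 1.02 and analysis of microbial essential genes [D]; Tianjin University; 2005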
Article ID: 2115898
Article link: https://www.wllwen.com/shoufeilunwen/jckxbs/2115898.html