Genome Reannotation of Mycobacterium tuberculosis
[Abstract]: Tuberculosis causes serious harm to human health worldwide every year, and Mycobacterium tuberculosis is its pathogen. Much progress has been made in the genomics of Mycobacterium tuberculosis, and annotation of its entire genome is available in public genome databases. Over time, however, more and more gene function information has been added to these databases, and it may include genes with similar sequences that were not available when Mycobacterium tuberculosis was first annotated. In genome analysis, this new information may serve as a source for transferring functions to some hypothetical genes. In addition, genes missing from the original annotation may be found by comparison with the newly added gene function information. To address these problems, we reannotate the genome of Mycobacterium tuberculosis by combining gene similarity comparison with new gene discovery based on ab initio prediction. The method of this study can serve as a reference for genome reannotation of other species. The main contents of this study are:
1. Based on Z curve theory, protein-coding genes with known function (type-one genes) are selected from the original annotation as positive samples, and negative samples are generated by randomly shuffling the sequences of the type-one genes. With the positive and negative samples as the training set, a Fisher model with five-fold cross-validation is used to identify the non-coding members among the hypothetical genes (type-two genes), that is, the genes wrongly annotated in the original annotation.
2. Prodigal and Zcurve were used to predict genes in the Mycobacterium tuberculosis genome. The predictions were compared with the original genome annotation, and candidate genes with a low overlap rate were selected for Blast sequence alignment. New genes that met the conditions were selected using the chosen screening parameters, and specific functional annotation was added to them.
Gene reannotation requires manual screening by researchers. When a large number of genomes need to be reannotated, and especially when qualifying new genes must be selected from Blast results, this becomes a very heavy task. This study therefore also developed a set of Web tools, written in PHP, that reannotate genomes automatically, which reduces the manual screening workload and greatly improves the efficiency of gene reannotation.
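The training-set construction in step 1 can be illustrated with a minimal Python sketch. It assumes the simplest form of the Z curve, the nine phase-specific components (x, y, z for each of the three codon positions); the abstract does not state which Z curve variables the thesis actually used. The function names `zcurve_features` and `shuffled_negative` are hypothetical, and the Fisher discriminant and five-fold cross-validation steps are not shown.

```python
import random


def zcurve_features(seq):
    """Return 9 phase-specific Z-curve components for a DNA sequence.

    For each codon position (phase 0, 1, 2), base frequencies a, c, g, t
    are computed and transformed into:
      x = (a + g) - (c + t)   purine vs. pyrimidine
      y = (a + c) - (g + t)   amino vs. keto
      z = (a + t) - (g + c)   weak vs. strong hydrogen bonding
    Assumption: only these 9 mononucleotide variables are used here.
    """
    seq = seq.upper()
    features = []
    for phase in range(3):
        bases = seq[phase::3]  # every third base starting at this phase
        n = len(bases)
        if n == 0:
            raise ValueError("sequence must contain at least 3 bases")
        a = bases.count("A") / n
        c = bases.count("C") / n
        g = bases.count("G") / n
        t = bases.count("T") / n
        features.extend([
            (a + g) - (c + t),
            (a + c) - (g + t),
            (a + t) - (g + c),
        ])
    return features


def shuffled_negative(seq, rng):
    """Build a negative sample by randomly shuffling the bases of a gene.

    The shuffled sequence keeps the base composition of the original
    but destroys its codon-position structure.
    """
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the illustration is repeatable
    gene = "ATGGCCAAGCTGACCGGTTAA"  # made-up toy sequence, not a real gene
    positive = zcurve_features(gene)
    negative = zcurve_features(shuffled_negative(gene, rng))
    print(len(positive), len(negative))
```

In a full pipeline, the feature vectors of the positive and shuffled negative samples would be passed to a linear discriminant; that part is left out because the abstract gives no details of its parameters.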
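[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Master's
[Year degree conferred]: 2014
[Classification number]: R378.911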
Article ID: 2193960
Article link: https://www.wllwen.com/yixuelunwen/shiyanyixue/2193960.html