Data Mining Analysis of Genetic Disease Mutations
[Abstract]: As sequencing technology has advanced and its cost has fallen, genome sequencing has been applied to Mendelian genetic diseases, complex diseases, and cancer gene detection, producing a large amount of sequencing data. These data are important for studying disease pathogenesis, clinical diagnosis, and individualized treatment. The molecular pathogenesis of more than 4000 human genetic diseases remains unclear. Studies have shown that the mechanism of genetic diseases is closely related to alternative splicing, and splice sites are among the important regulatory elements of the alternative splicing mechanism. Studying the pathogenesis of genetic diseases at the splice-site level is therefore very important. To address this problem, a sequence pattern mining model is used to study splice-site mutations in genetic diseases. Cancer is among the greatest threats to human health. Identifying potential proto-oncogenes and tumor suppressor genes can not only improve our understanding of tumorigenesis and cancer progression but also contribute to the development of personalized cancer therapy. Genome sequencing studies over the past few years have produced a large amount of data on somatic mutations in cancer, but interpreting this sequence information remains a major challenge. Many methods have been developed to identify driver genes according to the function of the genes that carry the mutations. Although some computational tools can predict the functional impact of mutations, their usefulness is limited. Mutations shared between genetic diseases and cancer somatic mutations have established molecular mechanisms that affect protein function. We assume that genes carrying these shared mutations are cancer driver genes, and we use the overlap between genetic disease mutations and cancer somatic mutations to identify potential new cancer driver mutations.
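The overlap idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how mutations are represented or matched, so the tuple key (gene, protein position, reference residue, altered residue), the function names, and the toy records below are all assumptions made for illustration only; they are not the thesis's data or pipeline.

```python
from collections import Counter

# Assumption: a mutation is identified by
# (gene, protein position, reference residue, altered residue).
# The thesis may use a different key (for example, genomic coordinates).

def overlapping_mutations(disease_mutations, somatic_mutations):
    """Return the mutations present in both collections."""
    return set(disease_mutations) & set(somatic_mutations)

def overlap_count_per_gene(overlap):
    """Count how many overlapping mutations fall in each gene."""
    return Counter(gene for gene, *_ in overlap)

# Toy records with placeholder gene names (not real data).
disease = [
    ("GENE_A", 100, "R", "H"),
    ("GENE_A", 200, "G", "D"),
    ("GENE_B", 50, "P", "L"),
]
somatic = [
    ("GENE_A", 100, "R", "H"),
    ("GENE_A", 200, "G", "D"),
    ("GENE_B", 51, "P", "L"),   # different position, so no overlap
    ("GENE_C", 10, "A", "T"),
]

shared = overlapping_mutations(disease, somatic)
counts = overlap_count_per_gene(shared)
print(counts.most_common())
```

Ranking genes by such a count is one plausible reading of "according to the number of overlapped mutations"; any thresholds or statistical tests used in the thesis are not given in the abstract.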
The main work of this paper is as follows: (1) A sequence pattern mining model is used to study mutations in the splice-site regions of genetic disease genes. The model used in this paper fuses a frequent pattern mining algorithm with the PSSM (position-specific scoring matrix) algorithm. The experimental results show that the model classifies well when distinguishing genetic disease mutations from common mutations. Variation in the splice-site region weakens the splice-site signal, which disrupts normal splicing and leads to disease. (2) Cancer proto-oncogenes and tumor suppressor genes are identified using genetic disease mutations. In this study, we identified potential oncogenes and tumor suppressor genes using mutations that overlap between Mendelian disease mutations and cancer somatic mutations. Because mutations shared by genetic diseases and cancer somatic mutations have clear molecular mechanisms that affect protein function, we assume that they are more likely to be cancer driver mutations. Our results show that mutations overlapping between cancer somatic mutations and pathogenic genetic disease mutations recur more frequently in cancer and are enriched in known cancer genes. We identify potential tumor suppressor genes according to the number of overlapping mutations. The results suggest that genes related to ion channels, collagen, and Marfan syndrome may constitute a new class of tumor suppressor genes. Then, within each specific cancer type, we identify potential proto-oncogenes based on overlapping mutations that have high recurrence rates and are mutually exclusive with oncogene mutations. In conclusion, our research suggests that new cancer genes can be found in large volumes of cancer genome sequencing data by using the overlap between genetic disease mutations and cancer somatic mutations.
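To make the PSSM component in item (1) concrete, the following is a minimal sketch of a standard log-odds position-specific scoring matrix. It is not the fusion model described in the thesis, whose details (including the frequent pattern mining step) are not given in the abstract. The function names, the pseudocount, the uniform background frequency, and the toy sequences are assumptions for illustration.

```python
import math

BASES = "ACGT"

def build_pssm(sequences, pseudocount=1.0, background=0.25):
    """Build a log-odds PSSM from equal-length aligned sequences.

    Assumptions: uniform background frequency and an additive pseudocount.
    """
    length = len(sequences[0])
    pssm = []
    for pos in range(length):
        counts = {base: pseudocount for base in BASES}
        for seq in sequences:
            counts[seq[pos]] += 1
        total = sum(counts.values())
        pssm.append(
            {base: math.log2((counts[base] / total) / background) for base in BASES}
        )
    return pssm

def score(pssm, seq):
    """Sum the per-position log-odds scores for a sequence window."""
    return sum(column[base] for column, base in zip(pssm, seq))

# Toy donor-site-like windows, invented for illustration (not real data).
training = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA", "GAGGTAAGT"]
pssm = build_pssm(training)

reference = "CAGGTAAGT"
mutant = "CAGATAAGT"  # G>A at a position that is G in every toy training window

print(score(pssm, reference), score(pssm, mutant))
```

In this toy setting the mutant window scores lower than the reference window, which mirrors the abstract's observation that disease variants weaken the splice-site signal.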
[Degree-granting institution]: Anhui University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: R596;TP311.13
[Citation]: 王畅畅 (Wang Changchang); Data Mining Analysis of Genetic Disease Mutations [D]; Anhui University; 2017
[Article ID]: 2231678
[Article link]: https://www.wllwen.com/yixuelunwen/nfm/2231678.html