Mining activated signaling pathways in lung adenocarcinoma with a new method for constructing gene interaction networks
Topics: lung adenocarcinoma + gene interaction; Source: Shandong University, doctoral dissertation, 2016
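[Abstract]: Lung cancer has become the leading cause of cancer death in both developed and developing countries and regions, and non-small cell lung cancer accounts for 80% of lung cancer cases. Lung adenocarcinoma is a major type of non-small cell lung cancer. It arises in bronchiolar or alveolar epithelial cells, has a rich blood supply, typically shows peripheral metastasis, and accounts for about 50% of lung cancer deaths. Bioinformatics is an interdisciplinary field that integrates informatics, statistics, computer science and related techniques to analyze the information contained in massive biological data sets. Its development has produced a new model of biological research: existing data are used to form theoretical predictions first, which are then verified experimentally, so that the onset and progression of disease are studied at the molecular level and the findings guide prevention, diagnosis and treatment. This work is based on the ArrayExpress database. Differentially expressed genes (DEGs) between lung adenocarcinoma patients and normal controls were screened, and the interactions among the DEGs were studied by combining the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database, the differentially co-expressed genes and links (DCGL) method, the empirical Bayes (EB) method and weighted gene co-expression network analysis (WGCNA), which led to a new method for constructing gene interaction networks. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of the abnormally expressed genes was performed with the Expression Analysis Systematic Explorer (EASE) test. The merged method and a permutation test were then used to mine the pathways activated at different stages of lung adenocarcinoma (IA, IB, IIA, IIB, IIIA, IIIB and IV). The results offer potential molecular markers and important information for understanding the pathogenesis of lung adenocarcinoma, and new directions for further research on its onset, progression, diagnosis and treatment.

Part 1: A new method for constructing the gene interaction network of lung adenocarcinoma

Background: Lung cancer has the highest mortality of all malignant tumors, and lung adenocarcinoma is one of its main pathological types. Its incidence is high and its five-year survival rate is low, and targeted drug therapy is increasingly used in its treatment. Network-based studies of cancer gene expression have become a research focus in recent years, because analysis at the level of differentially expressed genes alone is not sufficient to explain pathogenesis. Interactions between genes strongly influence gene expression, and a full picture of both direct and indirect gene interactions is important for describing cellular mechanisms and functions comprehensively. In addition, different methods for building gene interaction networks give inconsistent results for the interactions between genes.

Objective: To analyze gene expression profiles of lung adenocarcinoma, screen the differentially expressed genes associated with the tumor, establish a new method for constructing gene interaction networks, and identify genes and signaling pathways that may be closely related to lung adenocarcinoma, providing a theoretical basis for further study of its molecular mechanisms.

Methods: Gene expression profile data related to lung adenocarcinoma were downloaded from the ArrayExpress database, and differentially expressed genes were screened with the RankProd package. Interaction networks among the differentially expressed genes were built with the STRING database and the DCGL, EB and WGCNA methods. The gene interaction values from the four networks were merged and transformed into new interaction values, and a new gene interaction network was built from these new values. The resulting algorithm, which combines the existing methods, is referred to as the merged method. Topological properties of the five networks were then analyzed, and hub genes were obtained from the merged network. Pathway enrichment analysis was performed on the gene pairs selected by each of the five methods.

Results: (1) A total of 941 differentially expressed genes were identified in lung adenocarcinoma, of which 386 were up-regulated and 555 were down-regulated. (2) Gene interaction networks were built with each of the four existing methods, and the new method for constructing gene interaction networks was established. (3) The topological properties of the networks built with STRING, DCGL, EB, WGCNA and the merged method were compared. The fitting coefficients of the node degree distributions after curve fitting were 0.931, 0.938, 0.963, 0.264 and 0.977, respectively, and the average shortest path lengths were 5.337, 2.715, 3.673, 1.783 and 4.195, respectively. The WGCNA network had the shortest average path length, and the merged network had the highest fitting coefficient. (4) Fifteen hub genes were selected from the merged network. TOP2A, PAICS, BUB1, ADAM12, FGB, NONO, UGT8, SRPX2, AOC1, AURKA, NCAPG and RACGAP1 were up-regulated; IL1RL1, TAC1 and DARC were down-regulated. (5) Pathway enrichment analysis of the 941 differentially expressed genes yielded seven significantly enriched pathways: extracellular matrix (ECM)-receptor interaction, cell adhesion molecules, the p53 signaling pathway, focal adhesion, vascular smooth muscle contraction, the cell cycle and the complement system. The gene pairs obtained with the merged method were enriched mainly in the cell cycle and p53 pathways, and the gene pairs shared by all five methods were enriched in the cell cycle pathway.

Conclusions: (1) A total of 941 differentially expressed genes in lung adenocarcinoma were identified and subjected to pathway enrichment analysis, providing a theoretical basis for better understanding the molecular mechanisms of lung adenocarcinoma. (2) A new method for constructing gene interaction networks was established. Network topology analysis shows that the network built with this method has more pronounced scale-free characteristics and high robustness, can give more reliable and feasible results, and has broad application prospects. The network built with WGCNA has stronger small-world properties and supports rapid integration of information. (3) Fifteen hub genes were identified. They may be closely related to the onset of lung adenocarcinoma and offer directions for further research on its pathogenesis and treatment. (4) Pathway enrichment analysis of the differentially expressed genes shows that the cell cycle pathway is closely related to the pathogenesis of lung adenocarcinoma.

Part 2: Activated pathways during the progression of lung adenocarcinoma

Background: The incidence and mortality of lung adenocarcinoma have continued to rise in recent years, and treatment outcomes and prognosis remain very poor for most patients. Gene expression profiling, signaling pathway research and targeted therapy in lung cancer are receiving growing attention, and network-based methods for screening and classifying signaling pathways are maturing. In a disease, a signaling pathway can be either activated or not activated. Activated pathways play a more active role in the disease, whereas non-activated pathways are merely present and may have no direct relation to its pathogenesis. Studying the pathways activated in cancer is therefore important for cancer treatment and prevention.

Objective: To use gene interaction network analysis and pathway activity analysis to mine the signaling pathways activated during the progression of lung adenocarcinoma, providing molecular markers for its diagnosis and treatment.

Methods: Gene expression profile data for different stages of lung adenocarcinoma were downloaded from the ArrayExpress database, and differentially expressed genes were screened with the RankProd package. KEGG pathway enrichment analysis was performed on the differentially expressed genes. Starting from these genes, the new method from Part 1 was used to build gene interaction networks for the different stages of lung adenocarcinoma (IA, IB, IIA, IIB, IIIA, IIIB and IV), and a permutation test was used to determine whether each enriched pathway was activated at each stage.

Results: (1) A total of 211 differentially expressed genes were identified in lung adenocarcinoma. (2) Across the seven stages of lung adenocarcinoma progression, the number of interactions between genes showed no clear trend. (3) KEGG pathway enrichment analysis found 10 signaling pathways enriched in common across the different stages: the cell cycle, progesterone-mediated oocyte maturation, oocyte meiosis, ECM-receptor interaction, vascular smooth muscle contraction, neuroactive ligand-receptor interaction, pathways in cancer, the p53 signaling pathway, the renin-angiotensin system and renal cell carcinoma. (4) Pathway activity analysis showed that the cell cycle, progesterone-mediated oocyte maturation and oocyte meiosis pathways were activated at every stage of lung adenocarcinoma. The p53 signaling pathway and pathways in cancer were activated at most stages, with stage IIIA as the exception, whereas the renin-angiotensin system pathway was not activated at any stage.

Conclusions: (1) Based on the differentially expressed genes of lung adenocarcinoma, 10 pathways enriched in common across the different stages were found. (2) Three pathways activated in common throughout lung adenocarcinoma progression were identified: the cell cycle, progesterone-mediated oocyte maturation and oocyte meiosis. These pathways may be potential markers for the diagnosis and treatment of lung adenocarcinoma.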
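The two topological quantities compared in Part 1 (the degree-distribution fitting coefficient and the average shortest path length) can be illustrated with a minimal sketch. The abstract does not state how the curve fit was done, so this example assumes a straight-line fit of log P(k) against log k and reports its R^2 as the fitting coefficient. The function names and the toy edge list are hypothetical, and unweighted, undirected edges are assumed.

```python
import math
from collections import Counter, deque


def degree_powerlaw_r2(edges):
    """R^2 of a straight-line fit to log P(k) versus log k (assumed fit)."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree.values())  # degree k -> number of nodes with degree k
    n_nodes = len(degree)
    xs = [math.log(k) for k in counts]
    ys = [math.log(c / n_nodes) for c in counts.values()]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return float("nan")  # fit undefined: fewer than two distinct values
    return sxy ** 2 / (sxx * syy)  # squared Pearson r equals R^2 for a line fit


def average_shortest_path_length(edges):
    """Mean hop distance over all reachable ordered node pairs (BFS per node)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adj[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("nan")


if __name__ == "__main__":
    # Hypothetical toy network; the gene names are used only as node labels.
    toy_edges = [("TOP2A", "BUB1"), ("TOP2A", "AURKA"), ("TOP2A", "NCAPG"),
                 ("BUB1", "AURKA"), ("NCAPG", "RACGAP1")]
    print(degree_powerlaw_r2(toy_edges))
    print(average_shortest_path_length(toy_edges))
```

Under these assumptions, a higher R^2 means the degree distribution is closer to a power law (scale-free behaviour), and a shorter average path length points toward small-world behaviour, which is how the abstract reads the merged and WGCNA networks.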
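The permutation test used in Part 2 to decide whether an enriched pathway is activated can be sketched as follows. The abstract does not define the pathway activity statistic, so this example assumes one: the sum of interaction weights over network edges whose two genes both belong to the pathway. The null distribution is built from random gene sets of the same size drawn from the network. All function names, weights and gene labels here are hypothetical.

```python
import random


def pathway_score(edge_weights, genes):
    """Assumed activity statistic: total weight of edges inside the gene set."""
    genes = set(genes)
    return sum(w for (u, v), w in edge_weights.items()
               if u in genes and v in genes)


def permutation_p_value(edge_weights, pathway_genes, n_perm=1000, seed=0):
    """One-sided p-value: how often a random gene set scores at least as high."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    universe = sorted({gene for edge in edge_weights for gene in edge})
    wanted = set(pathway_genes)
    members = [gene for gene in universe if gene in wanted]
    observed = pathway_score(edge_weights, members)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(universe, len(members))
        if pathway_score(edge_weights, sample) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical stage-specific network: (gene, gene) -> merged interaction value.
    weights = {("A", "B"): 0.9, ("B", "C"): 0.8, ("A", "C"): 0.7,
               ("C", "D"): 0.1, ("D", "E"): 0.1, ("E", "F"): 0.1}
    print(permutation_p_value(weights, {"A", "B", "C"}))
    print(permutation_p_value(weights, {"D", "E", "F"}))
```

In this sketch a pathway would be called activated at a given stage when its p-value falls below a chosen threshold in that stage's network. The threshold and the statistic are design choices that the abstract leaves unspecified.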
[Degree-granting institution]: Shandong University
[Degree level]: Doctoral
[Year degree granted]: 2016
[Classification number]: R734.2
Article ID: 1810667
Article link: https://www.wllwen.com/yixuelunwen/zlx/1810667.html