Research on Ensemble Feature Selection and Gene Regulatory Network Construction
[Abstract]: With the rapid development of bioinformatics technology and the emergence of massive genomic data, research has entered the post-genome era. Researchers are no longer limited to studying the function of a single gene; they hope to understand, from a systems point of view, the complex processes that sustain life. Against this background, systems biology has developed rapidly. One of its challenges is the construction of gene regulatory networks, which describe the interactions between genes as a graph. Constructing gene regulatory networks by reverse engineering can help us better understand the molecular mechanisms that keep organisms stable when environmental conditions fluctuate. With the development of DNA microarray technology and the rapid accumulation of gene expression data, many methods for constructing gene regulatory networks have emerged. Gene sequence data and functional annotation data are also becoming available. Different types of data often provide different information, so making effective use of the complementarity of multiple data sources is important for constructing gene regulatory networks accurately. Feature selection methods that construct gene regulatory networks from gene expression data have a shortcoming: they usually give only an importance score for each potential edge of the network, without determining an appropriate threshold for converting the ranking into a network structure. To address this, this thesis proposes the Ensemble Feature Importance-Genetic Algorithm (EFI-GA), which combines an ensemble feature selection algorithm with a genetic algorithm to construct gene regulatory networks.
First, the ensemble feature selection method computes an importance score for each potential regulator of a target gene; the score indicates how credible a true regulatory relationship between that regulator and the target gene is. A genetic algorithm then selects the optimal subset of regulators from those with high credibility. Experimental results on datasets from the Dialogue for Reverse Engineering Assessments and Methods (DREAM) show that the proposed method is effective.
To respond to external environmental stimuli or to carry out a given biological process, transcription factors regulate target genes that perform the corresponding functions; transcription factors and their target genes thus take part in the same process and often have the same or similar functions. Taking this functional correlation into account helps improve the accuracy of regulatory network construction. To make effective use of the features provided by different data sources, this thesis proposes a multi-feature fusion method that constructs gene regulatory networks by combining gene expression data, gene sequence data, and Gene Ontology (GO) data. Feature vectors are constructed from the multiple data sources, and a support vector machine classification model is built to predict regulatory relationships between transcription factors and target genes. Cross-validation results on Arabidopsis and tomato datasets show that the proposed method achieves higher accuracy.
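
The two-stage EFI-GA idea described in the abstract (score every candidate regulator, then let a genetic algorithm pick a subset instead of fixing a threshold) can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the importance score is a bootstrap-averaged absolute Pearson correlation standing in for the ensemble feature selection step, the fitness function is a hypothetical placeholder, and the names `ensemble_importance`, `fitness`, and `select_regulators` as well as the synthetic data are invented.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def ensemble_importance(regs, target, rng, n_rounds=50):
    """Assumed stand-in for ensemble feature selection: average the
    absolute correlation of each regulator with the target over
    bootstrap resamples of the expression samples."""
    n = len(target)
    scores = [0.0] * len(regs)
    for _ in range(n_rounds):
        idx = [rng.randrange(n) for _ in range(n)]
        y = [target[i] for i in idx]
        for j, reg in enumerate(regs):
            scores[j] += abs(pearson([reg[i] for i in idx], y)) / n_rounds
    return scores


def fitness(mask, regs, target, penalty=0.02):
    """Hypothetical fitness: how well the sign-aligned sum of the selected
    regulator profiles tracks the target, minus a penalty per regulator."""
    chosen = [r for r, bit in zip(regs, mask) if bit]
    if not chosen:
        return -1.0
    signs = [1.0 if pearson(r, target) >= 0 else -1.0 for r in chosen]
    combined = [sum(s * r[i] for s, r in zip(signs, chosen))
                for i in range(len(target))]
    return abs(pearson(combined, target)) - penalty * len(chosen)


def select_regulators(regs, target, scores, rng,
                      pop_size=20, generations=20, p_mut=0.1):
    """Simple genetic algorithm over bit masks of candidate regulators."""
    n = len(regs)
    top = max(scores) or 1.0
    # Importance scores bias the initial population toward credible regulators.
    pop = [[rng.random() < s / top for s in scores] for _ in range(pop_size)]

    def score_of(mask):
        return fitness(mask, regs, target)

    best = max(pop, key=score_of)
    for _ in range(generations):
        nxt = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(nxt) < pop_size:
            # Tournament selection of two parents, then one-point crossover.
            p1 = max(rng.sample(pop, 3), key=score_of)
            p2 = max(rng.sample(pop, 3), key=score_of)
            cut = rng.randrange(1, n)
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = [(not b) if rng.random() < p_mut else b for b in child]
            nxt.append(child)
        pop = nxt
        best = max(pop, key=score_of)
    return [j for j, bit in enumerate(best) if bit]


# Synthetic example: six candidate regulators, the target depends on 0 and 2.
rng = random.Random(0)
n_samples = 60
regs = [[rng.gauss(0, 1) for _ in range(n_samples)] for _ in range(6)]
target = [regs[0][i] - regs[2][i] + 0.1 * rng.gauss(0, 1)
          for i in range(n_samples)]

scores = ensemble_importance(regs, target, rng)
selected = select_regulators(regs, target, scores, rng)
print("importance scores:", [round(s, 2) for s in scores])
print("selected regulators:", selected)
```

The point of the sketch is the division of labour: the scores only rank candidates, while the subset search decides how many regulators to keep, so no ranking threshold has to be chosen by hand. A real fitness function would more plausibly measure how well the selected regulators predict the target gene's expression.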
[Degree-Granting Institution]: Dalian University of Technology
[Degree Level]: Master's
[Year Degree Awarded]: 2016
[Classification Number]: Q811.4; TP18
Article ID: 2466104
Article link: https://www.wllwen.com/kejilunwen/jiyingongcheng/2466104.html