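Mixed Linear Model Approaches for Exploring the Genetic Architecture of Complex Traits and Their Software Development
Keywords: complex traits; crop seed traits; mixed linear model; linkage analysis; genome-wide association analysis; epistasis; gene-by-environment interaction effects. Source: Zhejiang University, 2016 doctoral dissertation. Document type: dissertation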
[Abstract]: Crop seeds, an important source of human food, animal feed, and industrial raw materials, consist mainly of a diploid embryo and a triploid endosperm. Most agronomic traits, including seed traits, are complex traits: they are not controlled by a single gene alone but are also affected by epistatic effects and gene-by-environment interaction effects. With the development of high-throughput sequencing technology, genome-wide association analysis has become an effective means of detecting genetic variants underlying human diseases and complex agricultural traits. However, most current association methods are based on simple additive models and analyze only a single quantitative trait. To address the problems in linkage analysis of crop seed traits and in multi-trait genome-wide association analysis, we developed corresponding new methods based on mixed linear models to dissect the genetic architecture of complex traits. Monte Carlo simulations and analyses of real data both demonstrated the unbiasedness and reliability of the new methods. The thesis consists of the following three chapters.

Chapter 1 reviews recent research progress in linkage analysis and association analysis and the software developed for them. It also introduces the statistical methods commonly used for hypothesis testing and parameter estimation in mixed linear models.

Chapter 2 presents the newly developed experimental designs and statistical methods for mapping seed traits based on mixed linear models. Seeds of flowering plants arise from double fertilization; they play an important role in reproduction and are also a main source of animal feed and human food. Seed development involves multiple genetic systems, such as the maternal genome, the embryo genome, and the endosperm genome. Because of this complex multi-system inheritance, and in particular the presence of epistasis within a genome and between genomes as well as interactions between each genetic component and the environment, studying the genetic mechanisms of seed traits is very challenging. Based on the genetic characteristics of seed traits, we proposed two statistical genetic models that include maternal additive and dominance effects; additive and dominance effects of the embryo or endosperm; additive-by-additive epistatic effects within the maternal genome; additive-by-additive epistatic effects within the embryo or endosperm genome; additive-by-additive epistatic effects between the maternal genome and the embryo or endosperm genome; and the interactions of all these effects with the environment. The genetic mapping population can be produced by random mating of an immortalized F2 population, by two-way backcrossing of the immortalized F2 to both parents, or by selfing of the immortalized F2 population. Monte Carlo simulations examined how different heritabilities and different models affect parameter estimation. A real-data analysis of cotton seed traits also supported the reliability of the method. Based on the proposed method, we developed the software QTLNetwork-Seed-1.0.exe for mapping seed traits.

Chapter 3 presents the newly developed multi-trait genome-wide association analysis method based on mixed linear models, together with the accompanying statistical software. With the development of high-throughput sequencing technology, genome-wide association analysis has become a widely used new approach for exploring the genetic architecture of complex traits. A major problem in association analysis, however, is that correlations among individuals and among loci can produce false positives; the mixed linear model is an effective way to control for population structure. In addition, most complex disease syndromes comprise a set of highly correlated clinical or molecular phenotypes, so these traits should be analyzed jointly to detect genetic variants shared by multiple traits. Yet most current methods are single-trait association models based on additive effects. We therefore extended the multivariate mixed linear model to include epistasis and gene-by-environment interaction effects. The proposed method can detect not only pleiotropic genes but also genes with trait-specific expression. Extensive simulation studies investigated the effects of different residual correlation coefficients, different heritabilities, and different models on detection power and on the precision of effect estimates. Real rice data also demonstrated the effectiveness of the method. Based on the proposed method, we developed the software JAMT (Joint Analysis for Multiple Traits) for joint association analysis of multiple traits.
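The abstract's point that a mixed linear model can control for population structure can be illustrated with a generic single-marker sketch. This is not the thesis's method and does not reproduce QTLNetwork-Seed or JAMT; it assumes the textbook model y = mu + snp*b + u + e with u ~ N(0, h2*K) and e ~ N(0, (1-h2)*I), and it treats the variance ratio h2 as known, whereas in practice it would be estimated (for example by REML). The function name `gls_snp_effect` and all simulation settings are hypothetical.

```python
import numpy as np


def gls_snp_effect(y, snp, K, h2):
    """Generalized least squares estimate of one SNP's additive effect.

    Assumed model (illustrative only): y = mu + snp*b + u + e, where
    u ~ N(0, h2*K) captures relatedness through the kinship matrix K and
    e ~ N(0, (1-h2)*I). h2 is taken as known here, which is a simplification.
    Returns (effect estimate, standard error).
    """
    n = len(y)
    V = h2 * K + (1.0 - h2) * np.eye(n)      # phenotypic covariance up to scale
    X = np.column_stack([np.ones(n), snp])   # intercept + marker genotype
    Vinv_X = np.linalg.solve(V, X)
    XtVinvX = X.T @ Vinv_X
    beta = np.linalg.solve(XtVinvX, Vinv_X.T @ y)
    resid = y - X @ beta
    sigma2 = resid @ np.linalg.solve(V, resid) / (n - X.shape[1])
    cov = sigma2 * np.linalg.inv(XtVinvX)
    return beta[1], np.sqrt(cov[1, 1])


# Small simulated example (all values are made up for illustration).
rng = np.random.default_rng(0)
n, m = 100, 300
G = rng.integers(0, 3, size=(n, m)).astype(float)   # genotypes coded 0/1/2
Z = (G - G.mean(axis=0)) / G.std(axis=0)            # standardized markers
K = Z @ Z.T / m                                     # marker-based kinship
u = Z @ rng.normal(size=m) * np.sqrt(0.5 / m)       # polygenic term, cov = 0.5*K
snp = G[:, 0]
y = 1.0 + 0.4 * snp + u + rng.normal(scale=np.sqrt(0.5), size=n)

b_hat, se = gls_snp_effect(y, snp, K, h2=0.5)
print(f"estimated effect: {b_hat:.3f} (SE {se:.3f})")
```

When K is the identity matrix, V reduces to the identity and the estimate coincides with ordinary least squares, which makes a convenient sanity check. Extending this sketch to epistasis, gene-by-environment terms, or several traits at once would require the fuller models described in the abstract.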
[Degree-granting institution]: Zhejiang University
[Degree level]: Doctoral
[Year degree conferred]: 2016
[Classification numbers]: TP311.52; Q348
[Author]: 祁婷 (Qi Ting)
Article ID: 1514854
Article link: https://www.wllwen.com/shoufeilunwen/jckxbs/1514854.html