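Exploration and Interpretation of Genetic Variants Associated with Metabolic Syndrome and Its Metabolic Components Based on Multi-omics Data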

Published: 2018-04-24 09:16

  Topics: metabolic syndrome + genome-wide association study; Source: Zhejiang University, doctoral dissertation, 2017


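[Abstract]: Metabolic syndrome is a cluster of abnormal metabolic indicators, characterized mainly by central obesity, hypertension, and disordered glucose and lipid metabolism. Studies show that metabolic syndrome increases the risk of cardiovascular and cerebrovascular disease and of diabetes. By the definition of the Chinese Diabetes Association, the prevalence of metabolic syndrome among Chinese residents reached 17.6% in 2010. Twin studies indicate that genetic factors play an important role in the development of metabolic abnormalities. Classical genome-wide association studies (GWAS) have identified a number of genetic variants associated with metabolic abnormalities, but two major problems remain: (1) the loci identified so far explain only a small share of heritability, leaving a large amount of missing heritability to be explored; (2) earlier studies focused on phenotype-associated tag SNPs and the genes nearest to them, yet later work showed that most association signals do not act through the nearest gene. For example, the obesity-associated locus mapped to FTO has been shown to affect the phenotype mainly through long-range regulation of IRX3 expression. In short, interpretation of GWAS signals has been very limited, and most key genes and causal variants have not been found, which hampers follow-up functional studies.

Recent studies have found that most phenotype-associated variants affect the phenotype by regulating gene expression levels. Using expression regulation as an annotation narrows the genome-wide search space to expression quantitative trait loci (eQTLs) and makes the discovery of phenotype-associated loci more efficient. Projects such as ENCODE, GTEx, and Roadmap have released a large volume of multi-omics information that can annotate the regulatory path from genetic variant to phenotype, including tissue-specific gene expression, DNA methylation, histone modification, and transcription factor binding. This information can be used to screen phenotype-associated variants more precisely and to reinterpret known signals. Previous studies were conducted mainly in populations of European ancestry, and studies in the Han Chinese population remain limited. Because genetic background differs between populations, results from European-ancestry populations cannot be extrapolated directly to Han Chinese.

To systematically explore genetic susceptibility loci for metabolic syndrome and its metabolic components, this study takes the Han Chinese population as its main study population and uses genome-wide association analysis based on multi-omics annotation to discover and interpret variants associated with metabolic abnormalities. The work has three parts (Figure 1). Part 1 uses a classical GWAS strategy to screen for the variants most strongly associated with metabolic syndrome and its components, followed by multi-stage validation, functional analysis, and gene-environment interaction analysis. Part 2 builds on Part 1 and uses transcriptional regulation annotation to screen further for variants related to glucose and lipid metabolism, then resolves the basic regulatory path through fine mapping and functional experiments. Part 3 takes known glucose and lipid metabolism association signals and systematically reinterprets the signal regions with the latest multi-omics annotations to identify key regulatory genes and support the translation of association findings.

Figure 1. Overview of the study. Upper left: Manhattan plot of the association signals (P values) from the research group's metabolic syndrome GWAS; below it, the technical route of Part 1 (left) and of Part 2 (right). Right: phenotype-associated loci recorded in the NHGRI-EBI GWAS Catalog, the subject of Part 3, in which loci associated with metabolic abnormalities are systematically reinterpreted using multi-omics annotation.

Part 1. Genetic susceptibility to metabolic syndrome

1. Objective: To screen for and validate genetic susceptibility loci for metabolic syndrome in the Han Chinese population.

2. Materials and methods: Using single nucleotide polymorphisms as genetic markers, a genome-wide association analysis was first carried out in 1,742 samples from the Xiaoshan district of Hangzhou to identify the loci most strongly associated with metabolic syndrome and its components. These loci were then validated in multiple stages in independent samples from eastern, northern, northeastern, and other regions of China, 10,978 samples in total. After the validation stages were combined, loci reaching genome-wide significance underwent functional prediction and gene-environment interaction analysis.

3. Results: Genome-wide association analysis of metabolic syndrome and multi-stage validation in independent samples showed that the genotypes of rs651821 in APOA5 and of rs671, a high-frequency missense variant in ALDH2 that is specific to Asian populations, are associated with genetic susceptibility to metabolic syndrome. After conditioning on rs651821, the strongest signal in the APOA gene cluster region, rs180326 in BUD13 remained associated with serum triglyceride (TG) level (P_combined = 2.4E-08), making it a new secondary signal within the APOA gene cluster. An integrated analysis of the variants rs651821 and rs180326, serum APOA5 and BUD13 protein levels, and TG level indicated that BUD13, in addition to APOA5, takes part in regulating serum TG level. The study also found that the rs671 polymorphism not only influences drinking behavior through its effect on acetaldehyde metabolism, but also interacts with drinking behavior in its effect on metabolic syndrome and related metabolic phenotypes; the effect is present mainly among drinkers.

4. Summary: (1) In the Han Chinese population, the genotypes of rs651821 (APOA5) and rs671 (ALDH2) are associated with genetic susceptibility to metabolic syndrome. (2) The rs180326 (BUD13) genotype is associated with serum TG level, independently of the known locus rs651821 (APOA5). (3) The rs671 (ALDH2) genotype interacts with drinking behavior.

Part 2. Annotation-based screening and fine mapping of variants related to glucose and lipid metabolism

1. Objective: Building on Part 1 and incorporating multi-omics annotation, to further screen for and validate variants associated with glucose and lipid metabolism traits, to fine-map the newly identified variants, and to confirm the causal variant through functional experiments.

2. Materials and methods: First, a genome-wide association analysis of glucose and lipid metabolism traits in the 1,742 samples (the same as in Part 1) yielded variants that were associated with the phenotypes but had not been taken forward for validation in Part 1. These variants were then tested for association with gene expression levels in adipose tissue, liver, pancreatic islets, and skeletal muscle, and their expression signals were colocalized with the phenotype association signals to identify eQTLs related to glucose and lipid metabolism phenotypes. The genotype-phenotype associations were then validated in multi-stage independent samples. For loci that reached genome-wide significance, gene-based analyses were used to infer the likely regulated genes, and chromatin state, histone modification, and transcription factor binding signals in relevant tissues and cells from ENCODE and Roadmap were used to infer the active regulatory region and the corresponding causal variant. Finally, vectors carrying the different alleles were constructed, and luciferase reporter assays were used to confirm the effect on expression.

3. Results: Genome-wide association analysis combined with eQTL annotation identified 22 variants related to glucose and lipid metabolism that affect gene expression in specific tissues. Multi-stage validation in independent samples confirmed the association between rs1880118 and serum HDL-C level (P_combined = 1.4E-10). The region tagged by this variant mainly covers two genes, DAGLB and RAC1. Under an additive model, the rs1880118 genotype explains 47.7% of the variance in expression of DAGLB (diacylglycerol lipase, beta) in subcutaneous adipose tissue (P = 5.9E-42). The gene-based methods TWAS, SMR, and Sherlock also showed an association between DAGLB and serum HDL-C level, with P values of 3.0E-08, 1.1E-04, and 1.6E-06, respectively. Mapping of the histone marks H3K27ac, H3K4me3, and H3K9ac, transcription factor binding regions, and DNase I hypersensitive sites located an active regulatory fragment in the 5' region of DAGLB. Luciferase reporter assays showed that rs4724806 within this fragment (LD r² = 0.77 with rs1880118) may be the true causal variant: its minor allele increases transcriptional activity, consistent with the eQTL analysis.

4. Summary: rs1880118 is a newly identified variant associated with serum HDL-C level in the Han Chinese population. Its causal variant rs4724806 affects DAGLB expression by regulating transcriptional activity. This part of the study points to a role for DAGLB in lipid metabolism.

Part 3. Systematic interpretation of variants related to glucose and lipid metabolism using multi-omics annotation

1. Objective: To systematically interpret known glucose and lipid metabolism association signal regions with multi-omics annotation, to explore the regulatory path from genotype to phenotype, and to support follow-up functional studies.

2. Methods: First, previously reported variants associated with glucose and lipid metabolism were collected. Next, linkage disequilibrium (LD) information from the 1000 Genomes Project was used to add all variants in high LD with each lead SNP for subsequent fine mapping and functional prediction. These variants then received systematic multi-omics annotation covering gene expression, chromatin state, DNA methylation, histone modification, and transcription factor binding. Variants in coding regions were additionally annotated at the translational level, including cross-species conservation and post-translational modification. Finally, the likely regulatory paths from genotype to phenotype were compiled.

3. Results: After filtering, 592 variants associated with glucose and lipid metabolism were expanded by LD to a total of 17,646 variants for fine-mapping analysis (LD r² > 0.5). At the transcriptional level, expression annotation identified 104 variants associated with gene expression levels in visceral fat, liver, skeletal muscle, or pancreatic islet cells. Some eQTLs were specific to an environmental stimulus: for example, the association between rs702485 and DAGLB expression appeared only after LPS stimulation (P_before > 0.05, P_after = 2.52E-16). A total of 133 variants were associated with DNA methylation in adipose tissue or islet cells, and many of them were associated with methylation at multiple CpG sites. Fine mapping showed that 49 variants tag one or more variants located within histone modification peaks, in a tissue-specific manner. At the translational level, 122 variants (r² > 0.5) or 43 variants (r² > 0.8) tag one or more nonsynonymous variants, 16 of which were predicted by both SIFT and PolyPhen to potentially affect protein function. Fine mapping and functional prediction of the glucose metabolism locus rs1535500 showed that its G-to-T change is associated with the presence of 7 nearby CpG sites close to the CpG island at the 5' end of KCNK17, and may therefore affect DNA methylation. Integrating multi-omics information from islet cells showed that this variant can indeed affect KCNK17 expression by altering methylation at nearby sites, in contrast to earlier reports that the signal acts mainly through KCNK16.

4. Summary: (1) Reinterpreting glucose and lipid metabolism GWAS signals with integrated multi-omics information revealed many genes and regulatory elements that may lie on the path from genetic variant to phenotype. As with other complex phenotypes, about one third of the genes inferred after annotation agree with those previously reported for the locus. (2) Only 7%-20% of the variants associated with glucose and lipid metabolism phenotypes tag one or more nonsynonymous or nonsense variants; the rest may act at the transcriptional level. (3) Fine mapping of the glucose metabolism locus rs1535500 showed that its G-to-T change is associated with the presence of nearby CpG sites and thereby alters methylation and regulates KCNK17 expression.

Conclusions: The three parts support the following conclusions. (1) In the Han Chinese population, the genotypes of rs651821 (APOA5) and rs671 (ALDH2) are associated with genetic susceptibility to metabolic syndrome; the newly identified association of rs180326 (BUD13) with serum TG level is independent of the known regional locus rs651821; and rs671 interacts with drinking behavior. (2) Incorporating transcriptional regulation annotation can improve the GWAS screening strategy. rs1880118 is a newly identified variant associated with serum HDL-C level in the Han Chinese population, and the polymorphism rs4724806, which is in high LD with it, affects the phenotype by regulating DAGLB expression. (3) More than 80% of the loci associated with glucose and lipid metabolism act mainly by regulating transcription. About one third of the regulatory genes inferred from multi-omics annotation agree with the genes reported in the original GWAS. For example, the literature reports KCNK16 as the likely gene for the glucose metabolism locus rs1535500, whereas the annotation suggests that this variant may affect the phenotype by altering methylation and thereby regulating KCNK17 expression.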
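The abstract relies on linkage disequilibrium thresholds (LD r² > 0.5 and r² > 0.8) to decide which variants are carried along with a lead SNP, but it does not say how r² was computed beyond using 1000 Genomes Project LD information. The sketch below is only an illustration of the idea, not the thesis's pipeline: it assumes unphased genotype dosages coded 0/1/2 and uses the squared Pearson correlation between dosage vectors, a common approximation to haplotype-based r². The function names `ld_r2` and `ld_proxies`, the variant IDs, and the toy genotypes are all hypothetical.

```python
def ld_r2(g1, g2):
    """Squared Pearson correlation between two genotype dosage vectors (0/1/2).

    Assumption: used here as an approximation to haplotype-based LD r^2.
    Convention (ours): returns 0.0 if either variant is monomorphic,
    because the correlation is undefined in that case.
    """
    if len(g1) != len(g2) or not g1:
        raise ValueError("dosage vectors must be non-empty and of equal length")
    n = len(g1)
    m1 = sum(g1) / n
    m2 = sum(g2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    v1 = sum((a - m1) ** 2 for a in g1)
    v2 = sum((b - m2) ** 2 for b in g2)
    if v1 == 0 or v2 == 0:
        return 0.0
    return cov * cov / (v1 * v2)


def ld_proxies(lead_id, genotypes, threshold=0.5):
    """Return the sorted IDs of variants whose r^2 with the lead exceeds threshold."""
    lead = genotypes[lead_id]
    return sorted(
        vid
        for vid, dosages in genotypes.items()
        if vid != lead_id and ld_r2(lead, dosages) > threshold
    )


# Hypothetical toy panel: six individuals, four variants.
toy_panel = {
    "lead": [0, 0, 2, 2, 1, 1],
    "proxy_same": [0, 0, 2, 2, 1, 1],     # identical dosages, r^2 = 1
    "proxy_flipped": [2, 2, 0, 0, 1, 1],  # opposite allele coding, r^2 = 1
    "unlinked": [0, 2, 0, 2, 1, 1],       # zero covariance with lead, r^2 = 0
}

if __name__ == "__main__":
    print(ld_proxies("lead", toy_panel, threshold=0.8))
```

Because r² is a squared correlation, it does not depend on which allele is counted, so `proxy_flipped` is retained alongside `proxy_same`. Raising the threshold from 0.5 to 0.8 shrinks the proxy set, which matches the drop from 122 to 43 tagging variants reported in Part 3.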

[Degree-granting institution]: Zhejiang University
[Degree level]: Doctoral
[Year degree granted]: 2017
[Classification number]: R589

Article ID: 1796045

Article link: https://www.wllwen.com/shoufeilunwen/yxlbs/1796045.html

