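A Study of the Different Types of Chromatin Interactions Mediated by CTCF
Topic: CTCF + chromatin interactions; Source: doctoral dissertation, Academy of Military Medical Sciences of the Chinese People's Liberation Army (中国人民解放军军事医学科学院), 2015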
[Abstract]: CCCTC-binding factor (CTCF) is a widely expressed, multifunctional zinc finger protein with a highly conserved sequence, and is currently regarded as the only insulator-regulating protein in vertebrates. CTCF was first identified as a transcriptional repressor of the chicken c-myc gene. Subsequent studies have shown that it participates in many biological processes and has diverse functions, including repression and activation of gene transcription, insulator regulation, regulation of imprinted genes, X-chromosome inactivation, and effects on alternative splicing of mRNA, nucleosome repositioning, and DNA replication and recombination. Abnormal CTCF expression is often closely associated with the onset and progression of various diseases and tumors. Through different combinations of its 11 zinc fingers, CTCF selectively recognizes tens of thousands of target sites that are widely distributed across the genome. Binding of CTCF to its target sites is dynamically regulated together with DNA methylation state, which gives rise to cell-type-specific binding sites; different sites also differ in conservation and sequence features. Beyond its many binding sites, CTCF can interact directly with many different proteins, and even RNA, through the intrinsically disordered N- and C-terminal regions of its protein sequence and through various post-translational modifications. CTCF has also been shown to bind mitotic chromosomes and may act as a mitotic bookmarking factor, although the molecular mechanism and potential function remain to be explored.

As a major organizer of higher-order three-dimensional chromatin structure, CTCF relies on its many binding sites and on the help of other proteins and even RNA to mediate extensive chromatin interactions, both intra- and inter-chromosomal, helping to establish and maintain higher-order chromatin conformation and thereby carrying out its diverse functions. So far, however, the relationship among three aspects, namely the different types of CTCF binding sites on chromatin, the different types of chromatin interactions it mediates, and the different types of physiological functions it performs, has been examined only through individual observations or limited local data analyses. The underlying correspondences, their biological significance, and the biological logic behind them remain to be studied comprehensively and systematically.

This work uses ChIP data from ENCODE and other sources, including CTCF binding sites in the human genome; ChIA-PET data on CTCF-mediated chromatin interactions; ChIP data from modENCODE and other sources on dCTCF binding sites in the Drosophila genome and their changes across the mitotic cycle; and multi-level, multi-dimensional omics data such as other transcription factors and histone modification profiles. Combining these with several genome information databases and a range of bioinformatics tools and algorithms, we carried out a detailed and comprehensive systems-biology analysis of the different types of genome-wide CTCF maps from three perspectives: "binding sites", "interactions", and "transmission through the mitotic cycle".

In the first part, we compiled 162,209 CTCF binding sites from 70 human cell lines in the ENCODE database and derived their three-part binding motif. We found a correlation among the cell-type specificity of a binding site, the significance of its motif, and its sequence conservation: the less cell-type-specific a site is, the stronger its motif and the higher its conservation. We propose that non-specific binding sites shared by most cells, that is, constitutive sites, may act as key nodes in the genome that remain highly conserved under evolutionary selection pressure, while a significant motif provides higher affinity and therefore more stable binding to the CTCF protein. Functional analysis of CTCF target genes then showed that cell-type-specific CTCF binding sites may be associated with the regulation of cell-type-specific gene expression, whereas target genes of non-specific sites correspond to basic biological functions required by all cells. In addition, from a new angle, we performed a cluster analysis of 160 transcription factors that co-localize with CTCF binding sites and, for the first time, divided them into four types, suggesting that CTCF recruits different types of transcription factors at different types of binding sites and thereby performs different biological functions.

The second part focuses on chromatin interactions mediated by CTCF in the human genome. We first concluded that cell-type non-specificity (constitutiveness) of a CTCF binding site is a necessary and sufficient condition for its participation in mediating chromatin interactions: non-specific binding sites tend to participate in interactions, and sites that participate in interactions are mostly highly non-specific across cell types. From a topological perspective, we further found that highly non-specific CTCF binding sites often lie at the center of the chromatin interaction network and play a key role in maintaining the network's connectivity, stability, and robustness; we propose that such sites may serve as chassis or scaffold nodes of three-dimensional chromatin structure. We also found, for the first time, that CTCF can establish a special "many-to-many" interaction structure that may help build and maintain the three-dimensional structure of mouse species-specific chromatin regions, although its role and significance in the human genome remain to be explored. Furthermore, by integrating data on transcription factor co-localization, histone modification profiles, and chromosome state, we classified CTCF-mediated chromatin interactions into different types and found that the cohesin complex and ZNF143 may play important roles in helping CTCF establish and maintain stable chromatin interactions. The role of cohesin has been reported in the earlier literature, whereas the specific mechanism of ZNF143 requires further study. Taken together, these results suggest that CTCF acts as a chassis in the chromatin interaction network it mediates: it selectively recognizes specific target sites, establishes chromatin interactions, and, with the help of the cohesin complex and ZNF143, forms stable interactions of different types. Some of these serve as the scaffold of three-dimensional chromatin structure and maintain the stability of chromatin conformation, while others recruit additional transcription factors and give rise to different mechanisms of transcriptional regulation.

The third part examines the role of CTCF binding to chromatin in establishing and maintaining cell identity. Taking mitosis as the entry point and the model organism Drosophila as the subject, we analyzed, class by class, how dCTCF binding on the Drosophila genome changes across stages of the mitotic cycle and what these changes mean biologically. We first divided dCTCF sites into three classes: shared between interphase and mitosis (IM), interphase-only (IO), and mitosis-only (MO). Analyses of sequence features such as motif, GC content, and conservation showed that MO sites differ in sequence features from IM and IO sites. We speculate that this is because chromosomes condense heavily during mitosis, so that their morphological environment and physical properties may be completely different from those in interphase, and that dCTCF binding at MO sites, the sites specific to mitotic chromosomes, may rely on a new molecular mechanism. We also found that IM sites have markedly higher motif strength, ChIP signal intensity, and conservation than the other classes of sites, and that their target genes are involved in many biological processes, covering both basic cellular physiology and mitotic progression. This indicates that stable dCTCF binding at IM sites is important for maintaining normal cellular activity, and in particular for establishing and maintaining cell identity. We further found that dCTCF sites of the same class tend to cluster together, and that TAD boundaries enriched for dCTCF binding sites, as well as boundaries of conserved domains in the Drosophila genome, show the same "like clusters with like" tendency. We propose that dCTCF may participate in establishing different TAD structures at different stages of the mitotic cycle while also helping to maintain the higher-order conformation of conserved genomic regions. Based on these results, we speculate that CTCF binding to mitotic chromosomes may have the following biological significance. On the one hand, by binding to sites associated with target genes that must perform specific functions during mitosis, CTCF participates in the transcriptional regulation of these genes, and it may also allow a new transcriptional regulatory network to be established quickly after the M/G1 transition to support cellular activity. On the other hand, CTCF may participate in establishing a mitosis-specific chromatin conformation and help chromatin rapidly rebuild a new three-dimensional structure once the cell enters interphase.

Taken together, the three parts of this work follow a complete research design that moves horizontally within interphase from points (binding sites) to a plane (the interaction network), and then vertically through the dynamic changes across the whole cell cycle. The work systematically builds a comprehensive, multi-level genome-wide map of CTCF, analyzes in depth the functions of CTCF and the biological significance behind them, and provides important experimental resources and a theoretical basis for research on CTCF.
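The three-way division of dCTCF sites described in the abstract (IM, IO, MO) can be pictured as an overlap test between two peak sets. The sketch below is only an illustration under stated assumptions: the abstract does not say how sites from the two stages were matched, so the "any overlapping base" criterion, the `(chrom, start, end)` tuple format, the function names, and the example coordinates are all hypothetical.

```python
from collections import defaultdict


def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end) share a base."""
    return a[0] < b[1] and b[0] < a[1]


def classify_sites(interphase, mitosis):
    """Label peaks as IM (shared), IO (interphase-only) or MO (mitosis-only).

    Both inputs are lists of (chrom, start, end) tuples. This input format and
    the overlap rule are assumptions for illustration, not the thesis's method.
    """
    by_chrom_i = defaultdict(list)
    by_chrom_m = defaultdict(list)
    for chrom, start, end in interphase:
        by_chrom_i[chrom].append((start, end))
    for chrom, start, end in mitosis:
        by_chrom_m[chrom].append((start, end))

    labels = {}
    # Interphase peaks are IM if any mitotic peak overlaps them, else IO.
    for chrom, start, end in interphase:
        shared = any(overlaps((start, end), m) for m in by_chrom_m[chrom])
        labels[("interphase", chrom, start, end)] = "IM" if shared else "IO"
    # Mitotic peaks are IM if any interphase peak overlaps them, else MO.
    for chrom, start, end in mitosis:
        shared = any(overlaps((start, end), i) for i in by_chrom_i[chrom])
        labels[("mitosis", chrom, start, end)] = "IM" if shared else "MO"
    return labels


# Made-up coordinates, purely to show the three classes.
interphase_peaks = [("chr2L", 100, 300), ("chr2L", 1000, 1200)]
mitosis_peaks = [("chr2L", 250, 400), ("chr3R", 50, 150)]
labels = classify_sites(interphase_peaks, mitosis_peaks)
for key, label in sorted(labels.items()):
    print(key, label)
```

The pairwise scan is quadratic per chromosome, which is fine for a toy example; for genome-scale peak sets an interval-intersection tool would be the more practical choice.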
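[Degree-granting institution]: Academy of Military Medical Sciences of the Chinese People's Liberation Army (中国人民解放军军事医学科学院)
[Degree level]: Doctorate
[Year degree granted]: 2015
[Classification number]: Q343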
[Author]: Shen Wenlong (沈文龙)
Article ID: 1963508
Article link: https://www.wllwen.com/shoufeilunwen/jckxbs/1963508.html