Special cis-regulatory elements and the three-dimensional structure of chromatin
Published: 2018-08-23 10:50
[Abstract]: In eukaryotes, chromatin is the largest single biological macromolecule, the physical form in which the genome exists, and the shared carrier of genetic and epigenetic information; the nucleus is the space in which it is stored. During interphase, each chromosome occupies a relatively independent and relatively fixed region of the nucleus, known as a chromosome territory (CT). Each chromosome is further folded hierarchically into a complex, ordered three-dimensional (3D) higher-order structure, and these stable structures and their positions within the nucleus are closely related to DNA recombination and replication, gene transcription, and even translation. Many studies have found that numerous regulatory factors depend on the 3D structure of chromatin to exert their functions. Chromosome Conformation Capture (3C), introduced in 2002, is the core biochemical technique for studying chromatin folding. In recent years, understanding of chromatin folding has deepened with the rapid development of mapping techniques, including 3C derivatives such as 4C, 5C, Hi-C and Capture-C and ChIA-PET, which examines chromatin interactions mediated by a specific protein, as well as imaging techniques such as DNA-FISH. It is now generally accepted that interphase chromatin folding comprises structural units at different levels, from large to small: chromosome territories, topologically associating domains (TADs), and chromatin loops. As the final carrier and responder, or "integrator" (moderator), of the genome sequence, of DNA and histone modifications, and of the binding of chromatin regulatory proteins, the complex 3D higher-order structure of the genome is closely related to other omics features such as histone modification, DNA methylation and DNase sensitivity, and to functional features such as replication and transcription.

Cis-regulatory elements (CREs) usually refer to specific non-coding DNA sequences linked in series with structural genes. They regulate the precise initiation and the efficiency of transcription by binding transcription factors and other proteins, and thereby take part in the regulation of gene expression. According to sequence composition and other properties, they can be divided into types that include the common promoters, terminators, enhancers, insulators, boundary elements and genomic locus regulatory elements, as well as repeat elements and nuclear matrix attachment regions. Repeat elements occur many times in the genome, include a large number of DNA elements with different structures and origins, and can make up more than half of a eukaryotic genome. According to fragment length and biological properties, they can be further divided into tandem repeats, transposable elements, and others. Because of technical limits such as sequencing read length and alignment, little is yet known about the functions and mechanisms of repeat elements in the genome, and further investigation is needed. Matrix attachment regions (MARs) are AT-rich DNA elements that bind tightly to the nuclear matrix and are widespread in the genomes of different species, tissues and cell types. MARs are currently thought to act, on the one hand, as possible structural units of 3D chromatin folding that help establish and maintain chromosome territories and, on the other hand, as possible functional units of gene regulation that modulate gene expression by binding to the nuclear matrix and other components.

Based on experimental data on 3D chromatin structure and on integrative multi-omics analysis, and starting from genomic sequence features, this study examined the possible roles of several types of cis-regulatory elements in 3D chromatin folding, as well as the potential influence of 3D chromatin structure on the evolution of these elements. This lays a foundation for a deeper understanding of the possible roles of different elements in establishing, maintaining and transmitting higher-order 3D chromatin structure. The main contents of this thesis are as follows.

HBP, a one-stop software package for Hi-C data processing, was developed. A research approach for special cis-regulatory elements was established, and HBP (Hi-C BED file analysis Pipeline) was developed to systematically examine the chromatin interactions in which such elements take part; different elements can be processed and analyzed with simple operations. As an open-source, flexible and highly optimized one-stop platform, HBP can effectively analyze genome-wide 3D higher-order structure and the chromatin interactions involving special sequence elements, which greatly facilitates systematic study of chromatin folding in specific regions of interest. As a simple, efficient and reliable tool, HBP can integrate histone modification, ChIP-seq, RNA-seq and other omics data to mine and analyze chromatin interactions in specific regions on a large scale and to investigate their potential molecular mechanisms and biological significance from several angles.

Selected repeat elements were studied on the basis of 3D chromatin structure. Repeat elements are widely distributed in the genomes of many species and are highly diverse, and sequence and function differ among types. This study explored repeat-element subclasses that are highly associated with 3D chromatin structure and found that repeat elements, including Alu, participate extensively in 3D chromatin folding. For the first time, the evolutionary relationships among adjacent repeat-element subclasses were systematically examined in terms of spatial distance in the 3D structure of chromatin: subclasses that are relatively close in one-dimensional sequence evolution show relatively strong interactions in 3D space, suggesting that the evolutionary history of repeat elements may be closely linked to the formation of the genome's 3D higher-order structure.

Matrix attachment regions were studied on the basis of 3D chromatin structure. MARs are widespread across species; although they share sequence features such as relative AT enrichment, no other structural features such as significant conservation have been found, and their function and mechanism in the nucleus remain unsettled. This study explored MARs from three angles: interaction frequency distribution, network topology, and potential biological function. MARs were found to be highly associated with 3D chromatin structure and to account for a relatively large share of high-intensity interactions, suggesting that they have an important influence on chromatin folding. In addition, topology-based cluster analysis confirmed that MARs can be divided into different types, including a structural group involved in maintaining chromosome territories and higher-order spatial conformation and a functional group involved in regulating gene expression, suggesting that different types of MARs may play different roles in the higher-order structure and function of the genome in the nucleus.

In summary, taking 3D chromatin structure as its basis and aiming to study the roles of different cis-regulatory elements in that structure, this work developed HBP, a dedicated software package for Hi-C data analysis, established an integrative multi-omics analysis method, and analyzed the relationship between selected repeat elements and MARs and the genome's 3D higher-order structure. Several phenomena highly associated with 3D chromatin structure were found, laying a foundation for further research on higher-order chromatin conformation and on the structure and function of its constituent elements.
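As a rough illustration of the kind of question the abstract describes (how much of the Hi-C signal involves a given class of sequence elements), the sketch below marks the fixed-size genomic bins that overlap BED intervals and then tallies contact counts by whether both, one, or neither anchor falls in a marked bin. It is not HBP's actual code or interface: the function names, the bin-level contact tuple layout `(chrom1, bin1, chrom2, bin2, count)`, and the toy data are all assumptions made for this example.

```python
def element_bins(bed_lines, resolution):
    """Return the set of (chrom, bin_index) bins overlapped by BED intervals.

    Assumes BED-style 0-based, half-open coordinates and fixed-size bins.
    This is a hypothetical helper, not part of HBP.
    """
    bins = set()
    for line in bed_lines:
        if line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if end <= start:
            continue
        # The last base of the interval is end - 1, so that is the last bin touched.
        for b in range(start // resolution, (end - 1) // resolution + 1):
            bins.add((chrom, b))
    return bins


def summarize_contacts(contacts, bins):
    """Sum contact counts by how many anchors lie in element-containing bins.

    contacts: iterable of (chrom1, bin1, chrom2, bin2, count) tuples
    (an assumed layout for binned Hi-C contacts).
    """
    totals = {"both": 0, "one": 0, "none": 0}
    for c1, b1, c2, b2, count in contacts:
        hits = int((c1, b1) in bins) + int((c2, b2) in bins)
        totals[("none", "one", "both")[hits]] += count
    return totals


if __name__ == "__main__":
    # Toy data: two elements on chr1, bins of 100 bp.
    bed = ["chr1\t0\t150", "chr1\t420\t480"]
    contacts = [
        ("chr1", 0, "chr1", 4, 10),
        ("chr1", 1, "chr1", 2, 5),
        ("chr1", 2, "chr1", 3, 7),
        ("chr1", 4, "chr2", 0, 3),
    ]
    print(summarize_contacts(contacts, element_bins(bed, 100)))
    # prints {'both': 10, 'one': 8, 'none': 7}
```

Raw totals like these depend on how much of the genome the elements cover, so a real analysis would compare them against a background, for example the same tally for randomly placed intervals of matching size.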
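[Degree-granting institution]: Academy of Military Medical Sciences of the Chinese People's Liberation Army
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: Q78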
Article ID: 2198855
Article link: https://www.wllwen.com/shoufeilunwen/benkebiyelunwen/2198855.html