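Numerical Characterization of Protein Sequences Based on Generalized Pseudo-Amino Acid Composition
Published: 2018-06-26 04:46
Topic: generalized pseudo-amino acid composition + numerical characterization; Source: Bohai University, 2017 master's thesis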
[Abstract]: With the launch and completion of the Human Genome Project and the genome sequencing of other organisms, biological data have grown exponentially, and the focus of biological research has shifted from accumulating data to analyzing and interpreting it; bioinformatics emerged in response. Techniques for comparing and analyzing biological sequences play an increasingly important role in bioinformatics, and a key step in developing them is choosing an appropriate way to represent biological sequences. This thesis addresses that problem for protein sequences. Based on three important physicochemical properties of amino acids, an original protein sequence is reduced to a sequence over a 6-letter alphabet, from which a set of elements reflecting global and local sequence-order information is extracted. These elements are then combined with the frequencies of the 20 natural amino acids in the sequence to construct a (21+λ)-dimensional feature vector that characterizes the protein sequence. Applications to phylogenetic analysis and DNA-binding protein identification confirm the effectiveness of the proposed method.
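The pipeline described in the abstract (reduce to a 6-letter alphabet, extract global and local sequence-order elements, append them to the 20 amino acid frequencies) can be illustrated with a minimal sketch. The abstract does not give the thesis's actual grouping or element definitions, so everything specific here is an assumption made for illustration: the six groups in `REDUCED_GROUPS`, the use of Shannon entropy as the single global element, and the lag-k match fractions as the λ local elements. Only the overall shape, a vector of length 21+λ, is taken from the source.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical 6-group reduction; the thesis's real grouping (derived from
# three physicochemical properties) is not stated in the abstract.
REDUCED_GROUPS = {
    "a": "AVLIMC",  # aliphatic / hydrophobic
    "b": "FWYH",    # aromatic
    "c": "STNQ",    # polar, uncharged
    "d": "KR",      # positively charged
    "e": "DE",      # negatively charged
    "f": "GP",      # conformationally special
}
TO_REDUCED = {aa: letter
              for letter, members in REDUCED_GROUPS.items()
              for aa in members}


def reduce_sequence(seq):
    """Map an upper-case protein sequence onto the assumed 6-letter alphabet."""
    return "".join(TO_REDUCED[aa] for aa in seq)


def feature_vector(seq, lam):
    """Return a list of 21 + lam numbers: 20 frequencies, 1 global, lam local."""
    seq = seq.upper()
    if not seq or any(aa not in TO_REDUCED for aa in seq):
        raise ValueError("sequence must be non-empty and use only the 20 standard residues")
    if not 0 <= lam < len(seq):
        raise ValueError("lam must satisfy 0 <= lam < len(seq)")

    n = len(seq)
    counts = Counter(seq)
    freqs = [counts[aa] / n for aa in AMINO_ACIDS]

    reduced = reduce_sequence(seq)

    # Assumed global element: Shannon entropy (bits) of the reduced-letter distribution.
    reduced_counts = Counter(reduced)
    entropy = -sum((c / n) * math.log2(c / n) for c in reduced_counts.values())

    # Assumed local elements: for each lag k, the fraction of position pairs
    # (i, i + k) that carry the same reduced letter.
    local = []
    for k in range(1, lam + 1):
        matches = sum(reduced[i] == reduced[i + k] for i in range(n - k))
        local.append(matches / (n - k))

    return freqs + [entropy] + local
```

Calling `feature_vector(seq, 3)` on any valid sequence longer than three residues yields 24 numbers. Vectors built this way for several proteins could then be compared with an ordinary distance measure, which is the kind of input a phylogenetic analysis or a DNA-binding protein classifier would consume.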
[Degree-granting institution]: Bohai University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: Q51; Q811.4
Article ID: 2069165
Article link: https://www.wllwen.com/shoufeilunwen/benkebiyelunwen/2069165.html