Statistical Potential Energy Functions and Their Application in Protein Structure Prediction
Published: 2018-09-06 20:11
[Abstract]: Protein structure prediction is one of the hot topics in biology. It is also currently the approach most likely to narrow the huge gap between the number of known protein-coding sequences and the number of solved protein structures, and it is very important for studying protein structure and function. As a mainstream structure prediction approach, ab initio structure prediction has two main components: the potential energy function and the conformational search algorithm. The potential energy function is one of the most difficult problems in protein tertiary structure prediction. Studies have shown that traditional physics-based potential energy functions perform relatively poorly in protein structure prediction, whereas potential energy functions based on statistical distributions have clear advantages in both computational speed and accuracy. Despite extensive research, current statistical potential energy functions still cannot meet the needs of prediction. On the one hand, for the commonly used distance-dependent statistical potential energy functions, the reference state, which is their most critical component, is currently rather idealized and does not take the real environment of the peptide chain into account, so their performance is hard to improve further. On the other hand, a large amount of experimental protein structure data is now available, which makes more complex multidimensional statistical potential energy functions possible. In view of this situation, the author first proposed using the unfolded state of proteins as the reference state and constructed the distance-dependent statistical potential energy function SPOUSE. Because the unfolded state contains only the basic properties of the peptide chain and very few specific interactions, this choice not only unifies protein structure prediction and protein folding in theory; test results also show a clear improvement over existing functions of the same type. Subsequently, the author further improved the distance-dependent statistical potential energy function and proposed ORDER_AVE, a multidimensional statistical potential based simultaneously on positional and orientational information such as distance and angle. Because ORDER_AVE accounts for many-body effects to some extent, it improves very significantly on SPOUSE. Moreover, compared with other orientation-dependent functions, ORDER_AVE also has higher recognition accuracy. Meanwhile, the author also designed several other statistical potential energy functions, including soft-core van der Waals, hydrogen bonding, hydrophobic interaction, β-strand pairing, and contact energy terms, and tested the effect of each function on the prediction results in three proteins of different types. All of these energy functions were integrated into the final structure prediction program, in which the total energy of the system is the weighted sum of all the energy functions. Finally, building on previous work, the author took part in designing a variety of conformational search algorithms and techniques and wrote a protein structure prediction program in C++. Preliminary results show that our program performs well on α proteins and α/β proteins. Therefore, in this study we not only designed two high-performing statistical potential energy functions; with the program we wrote, we will also continue to play an important role in this field.
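The abstract mentions two ideas that a small example may help make concrete: a distance-dependent statistical potential defined relative to a reference state, and a total energy formed as a weighted sum of individual terms. The sketch below uses the generic inverse-Boltzmann form, E(r) = -kT ln(P_obs(r) / P_ref(r)), which is a common way to build such potentials; it is not the SPOUSE or ORDER_AVE implementation, which the abstract does not specify. The function names, the pseudocount, the bin counts, and the weights are all hypothetical and chosen only for illustration.

```python
import math


def distance_potential(observed_counts, reference_counts, kT=1.0, pseudocount=1e-6):
    """Generic inverse-Boltzmann potential per distance bin (illustrative only).

    observed_counts:  pair counts per distance bin in native structures.
    reference_counts: pair counts per distance bin in the reference state
                      (assumed here to come from some unfolded-state ensemble).
    Returns one energy per bin: E = -kT * ln(P_obs / P_ref).
    """
    if len(observed_counts) != len(reference_counts):
        raise ValueError("both histograms must use the same distance bins")
    n_obs = sum(observed_counts)
    n_ref = sum(reference_counts)
    energies = []
    for obs, ref in zip(observed_counts, reference_counts):
        # The pseudocount (an assumption of this sketch) avoids log(0) in empty bins.
        p_obs = obs / n_obs + pseudocount
        p_ref = ref / n_ref + pseudocount
        energies.append(-kT * math.log(p_obs / p_ref))
    return energies


def total_energy(terms, weights):
    """Weighted sum of energy terms; both arguments are dicts keyed by term name."""
    return sum(weights[name] * value for name, value in terms.items())


# Hypothetical histograms over three distance bins.
energies = distance_potential([10, 30, 60], [20, 30, 50])

# Hypothetical term values and weights for one conformation.
score = total_energy(
    {"distance": -2.0, "hbond": -1.0},
    {"distance": 0.5, "hbond": 2.0},
)
```

In this form, a bin that is populated more often in native structures than in the reference state receives a negative (favorable) energy, and a bin populated less often receives a positive one. This is why the choice of reference state matters so much for a distance-dependent potential.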
[Degree-granting institution]: Tsinghua University
[Degree level]: Doctoral
[Year degree granted]: 2015
[Classification number]: Q51
Article ID: 2227413
Article link: https://www.wllwen.com/shoufeilunwen/jckxbs/2227413.html