Optimization Algorithms for Drug Molecular Docking and Their Application on a Cloud Platform

Published: 2018-07-01 08:50

Topics: molecular docking + genetic algorithm; source: Dalian University of Technology, doctoral dissertation, 2014

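【Summary】: Drug molecular design, as a new approach to drug research, has produced many theoretical and practical findings. As the theories that explain the structure and function of biological macromolecules have developed and matured, the theory and methods of drug molecular design have deepened as well. Computer-aided drug design is a new interdisciplinary field that combines drug molecular design with rapidly advancing computer technology, and it has effectively pushed drug molecular design forward. Today, computer-aided drug design is a practical tool for new-drug discovery and can effectively reduce the cost of drug research. Molecular docking uses computer techniques to simulate the interaction and binding process between a ligand and a receptor, and is a very important and practical method in computer-aided drug design.

Cloud computing is a further integration and development of parallel computing, distributed computing and grid computing, and is a computing model deeply rooted in the Internet. With cloud computing, software and services that could previously be used only locally can be extended to the Internet, which greatly simplifies access for users. CUDA (Compute Unified Device Architecture) is a general-purpose parallel computing architecture that allows relatively inexpensive GPUs (Graphics Processing Units) to carry out complex scientific computing. Given these advantages, both cloud computing and CUDA parallel computing have become active research topics.

The main research contents of this thesis are:

(1) A drug molecular docking algorithm is proposed that uses k-means clustering to partition residue groups and the knowledge-based scoring method KScore. Residues at the receptor's active site are partitioned into groups by k-means clustering, and the motion of these groups is examined to approximate the flexible conformational changes of the receptor molecule. KScore is used to score the binding of each conformation, and an information entropy genetic algorithm iteratively solves the optimization model. Numerical simulation experiments show that the method has a short average docking time on the test set, is the fastest among the popular docking methods compared, and also achieves relatively good docking accuracy.

(2) An adaptive drug molecular docking algorithm based on multiple scoring functions is proposed. The scoring function uses force-field-based, empirical (extracting hydrophobic and entropy-change terms) and knowledge-based contributions. An adaptive docking optimization model is built to describe the problem, and an information entropy multi-population genetic algorithm solves it. The method evaluates the scoring contributions adaptively and jointly, which makes the evaluation more accurate and comprehensive while avoiding over-dependence on the training set. Compared with other mainstream software and methods, its average RMSD on the test set is very small, making it the most accurate of the docking methods evaluated.

(3) Because the information entropy multi-population genetic algorithm and the generation of the receptor scoring grid in molecular docking both require considerable computing time, parallel algorithms based on the CUDA architecture are proposed. The genetic operators, penalty function and space contraction factor of the information entropy genetic algorithm are computed in parallel. The parallelizable parts of the receptor scoring grid algorithm are analyzed in depth and, taking the characteristics of GPU computing into account, a parallel algorithm for generating the receptor scoring grid is proposed. Experimental results show that the parallelized algorithms are considerably more efficient than the original ones.

(4) To integrate computational biology applications effectively, raise the utilization of software and hardware resources and lower the difficulty of using those resources, a service-oriented computational biology cloud community platform is built on the basis of earlier grid-related work. Taking into account the platform's target users and the characteristics of its own resources, a four-layer cloud framework is proposed and applied to the construction of the cloud platform. The platform integrates the research results of the author's group (including the results of this thesis) and, on top of resource virtualization and the delivery of applications as services, offers users a convenient and efficient platform over the Internet. Users can carry out computational biology work on the platform, including molecular docking and virtual screening, and can organize applications into workflows according to specific needs.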
[Abstract]: As a new approach to drug research, drug molecular design has produced many theoretical and practical findings. As the theories explaining the structure and function of biological macromolecules have developed and matured, the theory and methods of drug molecular design have continued to deepen. Computer-aided drug design has become a practical means of discovering new drugs and can effectively reduce the cost of drug research. Molecular docking, which uses computers to simulate the interaction and binding of a ligand with a receptor, is an important and practical method within computer-aided drug design.

Cloud computing is a further integration and development of parallel computing, distributed computing and grid computing, and is a computing model deeply rooted in the Internet. With cloud computing technology, software and services that could previously be used only locally can be extended to the Internet, which makes them much more convenient for users. CUDA (Compute Unified Device Architecture) is a general-purpose parallel computing architecture that enables relatively inexpensive GPUs (Graphics Processing Units) to perform complex scientific calculations.

The main research contents of this thesis are:

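(1) A drug molecular docking algorithm is proposed that uses k-means clustering to partition residue groups and the knowledge-based scoring method KScore. The residues at the receptor's active site are divided into groups by k-means clustering, and the motion of these groups is used to approximate the flexible conformational changes of the receptor. KScore scores the binding of each conformation, and an information entropy genetic algorithm solves the optimization model iteratively. In numerical experiments the method had a short average docking time on the test set, was the fastest of the popular docking methods compared, and reached relatively good docking accuracy.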
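The residue-grouping step in (1) can be illustrated with a minimal sketch. It assumes each active-site residue is represented by a single 3D point (for example its centroid) and uses plain Lloyd's k-means with Euclidean distance. The coordinates, the value of k and the names `kmeans` and `residue_centers` are hypothetical; the abstract does not state which features or distance measure the thesis actually uses.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Plain Lloyd's k-means on 3D points. Returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct points as initial centroids
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach every residue to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable, so converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, labels


# Hypothetical residue centroids (in angstroms): two well-separated groups.
residue_centers = [
    (0.0, 0.0, 0.0), (1.0, 0.5, 0.0), (0.5, 1.0, 0.5),
    (10.0, 10.0, 10.0), (11.0, 10.5, 10.0), (10.5, 11.0, 10.5),
]
centroids, labels = kmeans(residue_centers, k=2)
print(labels)
```

Each resulting group could then be moved as one unit during docking, which is how the abstract describes approximating receptor flexibility.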

(2) An adaptive drug molecular docking algorithm based on multiple scoring functions is proposed. The scoring function combines force-field-based, empirical (extracting hydrophobic and entropy-change terms) and knowledge-based contributions, and the resulting adaptive optimization model is solved with an information entropy multi-population genetic algorithm. Because the contributions are weighed adaptively and jointly, the evaluation is more accurate and comprehensive, and over-dependence on the training set is avoided. Compared with other mainstream software and methods, the method has a very small average RMSD on the test set and is the most accurate of the docking methods evaluated.
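The accuracy measure cited in (2), RMSD, is the root-mean-square deviation between the atom positions of a docked pose and those of a reference pose (usually the crystal structure). A minimal sketch, assuming both poses list the same atoms in the same order and already share the receptor's coordinate frame, so no superposition is performed; the function name and the sample coordinates are illustrative only, and symmetry-equivalent atoms are not handled.

```python
import math


def rmsd(pose, reference):
    """Root-mean-square deviation between two equally ordered atom lists."""
    if not pose or len(pose) != len(reference):
        raise ValueError("poses must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(pose, reference))
    return math.sqrt(squared / len(pose))


# Hypothetical three-atom ligand; the docked pose is shifted 1 angstrom along x.
reference_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
docked_pose = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
print(rmsd(docked_pose, reference_pose))
```

Averaging this value over every complex in a test set gives the kind of "average RMSD" figure the abstract uses to compare docking methods.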

(3) Because the information entropy multi-population genetic algorithm and the generation of the receptor scoring grid both require considerable computing time, parallel algorithms based on the CUDA architecture are proposed. The genetic operators, penalty function and space contraction factor of the information entropy genetic algorithm are computed in parallel. The parallelizable parts of the receptor scoring grid algorithm are analyzed and, taking the characteristics of GPU computing into account, a parallel algorithm for generating the receptor scoring grid is proposed. The experimental results show that the parallel algorithms are more computationally efficient than the original ones.
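Why receptor scoring grid generation suits a GPU can be seen in a serial sketch: the value at each grid point depends only on that point's coordinates and the fixed receptor atoms, so every point can be computed independently. The code below addresses points by a flat index, the same decomposition a one-thread-per-grid-point CUDA kernel would typically use. The 12-6 Lennard-Jones term, its parameters and all function names are placeholders assumed for illustration; they are not KScore and not the thesis's implementation.

```python
import math


def point_score(point, receptor_atoms, epsilon=0.2, sigma=3.5, min_dist=1.0):
    """Placeholder score: 12-6 Lennard-Jones sum over all receptor atoms."""
    total = 0.0
    for atom in receptor_atoms:
        r = max(math.dist(point, atom), min_dist)  # clamp to avoid blow-up
        ratio6 = (sigma / r) ** 6
        total += 4.0 * epsilon * (ratio6 * ratio6 - ratio6)
    return total


def index_to_point(flat_index, origin, spacing, dims):
    """Map a flat index to coordinates, as a GPU thread maps its global ID."""
    nx, ny, _ = dims
    ix = flat_index % nx
    iy = (flat_index // nx) % ny
    iz = flat_index // (nx * ny)
    return (
        origin[0] + ix * spacing,
        origin[1] + iy * spacing,
        origin[2] + iz * spacing,
    )


def build_grid(receptor_atoms, origin, spacing, dims):
    """Evaluate every grid point; no iteration reads another's result."""
    n_points = dims[0] * dims[1] * dims[2]
    return [
        point_score(index_to_point(i, origin, spacing, dims), receptor_atoms)
        for i in range(n_points)
    ]


# Hypothetical single-atom receptor centered in a 5 x 5 x 5 grid.
grid = build_grid([(0.0, 0.0, 0.0)], origin=(-4.0, -4.0, -4.0),
                  spacing=2.0, dims=(5, 5, 5))
print(len(grid))
```

Because the loop body in `build_grid` has no dependency between iterations, it could be handed to parallel workers unchanged, which is the property the abstract's GPU version relies on.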

(4) To integrate computational biology applications effectively, improve the utilization of software and hardware resources and make those resources easier to use, a service-oriented computational biology cloud community platform is built on the basis of earlier grid-related work. A four-layer cloud framework is proposed and applied to the construction of the platform, on which users can carry out computational biology work such as molecular docking and virtual screening over the Internet.
【Degree-granting institution】: Dalian University of Technology
【Degree level】: Doctoral
【Year conferred】: 2014
【Classification number】: TP18; TP393.09

【References】