
Research and Application of Semantics-Based Protein Complex Identification Algorithms

Published: 2018-04-19 12:51

  Topics: protein interaction network + clustering coefficient; Source: Xi'an University of Technology, master's thesis, 2017


[Abstract]: In recent years, advances in biology have driven the successful completion of the Human Genome Project, and research in systems biology and proteomics has deepened alongside it. As a result, how to study protein complexes and their functional properties from the structure and biological characteristics of known protein interaction networks has become an active research topic. Protein complex identification algorithms address this problem: they can mine biologically meaningful protein complexes and predict the functions of unknown proteins. This thesis reviews the basic approaches to protein complex identification, mainly graph-partitioning methods, hierarchical methods, and methods based on fusing biological information. On that basis, taking the structure of the protein interaction network as the starting point and incorporating the structural characteristics of protein complexes themselves, it proposes two new complex identification algorithms.
(1) A protein complex identification algorithm based on semantic similarity. Most current identification algorithms operate on unweighted protein networks and ignore the inherent biological properties shared between proteins, which considerably reduces identification accuracy. This thesis therefore proposes a clustering algorithm based on semantic similarity, the DSC algorithm. The algorithm first constructs a weighted protein network, and identifies protein complexes on the unweighted network based on the edge clustering coefficient. Experiments show that the algorithm achieves good results.
(2) A protein complex identification algorithm based on hierarchical expansion from key nodes. Traditional algorithms focus on the overall topology of the network and neglect the structural characteristics of the complexes themselves. This thesis identifies complexes by selecting key nodes in the network and expanding from them over multiple levels. During hierarchical expansion, it uses the weighted interaction network constructed in this work, with the semantic similarity between nodes as the basis for expansion; the resulting algorithm, KNHE, is applied to the weighted protein network. Because the algorithm takes full account of the importance of known key proteins and of the structural characteristics of complexes, the experimental results show substantial improvements in measures such as sensitivity and specificity.
The two algorithms proposed in this thesis approach the problem from different angles and effectively address the problem of low identification rates; both cluster well, and the complexes they identify are generally biologically meaningful.
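The abstract names the edge clustering coefficient as the quantity DSC uses but does not give its formula. As a rough illustration only, the sketch below computes one common variant from the network literature, ECC(u, v) = z / min(k_u - 1, k_v - 1), where z is the number of triangles that contain the edge and k is node degree. The formula choice, the function names, and the toy protein labels are all assumptions made for this example; they are not taken from the thesis.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) interaction pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # skip self-interactions
            adj[u].add(v)
            adj[v].add(u)
    return adj


def edge_clustering_coefficient(adj, u, v):
    """Assumed variant: ECC(u, v) = z / min(k_u - 1, k_v - 1).

    z is the number of common neighbours of u and v (triangles on the
    edge). Returns 0.0 when either endpoint has no other neighbour,
    because no triangle can then contain the edge.
    """
    if v not in adj[u]:
        raise ValueError(f"({u}, {v}) is not an edge")
    triangles = len(adj[u] & adj[v])
    max_possible = min(len(adj[u]) - 1, len(adj[v]) - 1)
    if max_possible <= 0:
        return 0.0
    return triangles / max_possible


# Toy network (hypothetical labels): a triangle A-B-C with a pendant node D.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
adj = build_adjacency(edges)
ecc_ab = edge_clustering_coefficient(adj, "A", "B")
ecc_cd = edge_clustering_coefficient(adj, "C", "D")
```

In this toy network the edge inside the triangle scores higher than the pendant edge, which is the property that lets a clustering procedure treat high-scoring edges as likely intra-complex interactions.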
[Degree-granting institution]: Xi'an University of Technology
[Degree level]: Master's
[Year conferred]: 2017
[Classification number]: Q51;O157.5


