Analysis and Comparison of Several Key Algorithms in Protein Structure and Function Prediction
[Abstract]: With the rapid development of sequencing technology, the gap between the number of known protein sequences and the number of proteins with experimentally determined structure and function keeps widening. Predicting protein structure and function by computational methods is therefore an urgent need. Many effective methods have been proposed for studying the relationship among protein sequence, structure, and function, but each method performs better on some prediction problems than on others. This thesis therefore systematically compares different feature extraction methods, feature selection methods, and prediction algorithms on five tasks: prediction of protein structural class, protein disorder, protein molecular chaperones, protein solubility, and RNA-binding proteins. The main research contents are as follows:
1. The background and significance of protein research and the composition, structure, and physicochemical properties of proteins are briefly introduced, along with the commonly used databases and the data sets used in this thesis.
2. Amino acid reduction and feature extraction methods for protein structure and function prediction are analyzed and compared. Based on 522 amino acid properties, the 20 amino acids are reduced to k classes, and six different kinds of protein information are extracted. The efficiency of the reduction and extraction methods is compared using a support vector machine (SVM). The results show that for protein structure and molecular chaperone prediction it is better to reduce the 20 amino acids according to their conversion tendency and then extract sequence features, whereas protein solubility prediction favors the RCTD feature extraction method.
3. Feature selection methods for protein structure and function prediction are analyzed and compared. In this chapter, 16 feature selection methods based on mutual information and support vector machines are selected, and their efficiency is compared using the K-nearest neighbor (KNN) prediction algorithm. The results show that feature selection based on a nonlinear SVM performs best in protein structure, molecular chaperone, and solubility prediction. Accuracy with the selected features improves by 13.16-71, especially for the k-mer and PSSM features of proteins.
4. Prediction algorithms for protein structure and function prediction are analyzed and compared. In this chapter, seven prediction algorithms, including linear discriminant analysis (LDA) and principal component analysis (PCA), are selected and their efficiency on the different tasks is compared. The results show that SVM is the best algorithm for protein structural class prediction, especially when combined with PRseAAC features, reaching a prediction accuracy of 99.15%. Molecular chaperones can be predicted accurately with any of the PCADA, CART, PLSDA, KNN, or SVM algorithms. For protein disorder prediction, the KNN algorithm combined with RCTD features performs best, with an accuracy of 94.75%. For protein solubility prediction, PSSM features should be combined with the PLSDA or PCADA algorithm, while for RNA-binding protein prediction, GO features combined with either the CART or the PLSDA algorithm give better accuracy.
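The amino acid reduction and sequence feature extraction described in item 2 can be illustrated with a minimal sketch. The six-class grouping below is a hypothetical partition chosen only for illustration; the thesis derives its own groupings from amino acid properties, and those groupings are not given in the abstract. The function names `reduce_sequence` and `kmer_features` are likewise assumptions, not names from the thesis.

```python
from collections import Counter
from itertools import product

# Hypothetical grouping of the 20 amino acids into 6 classes by rough
# physicochemical character. This is an illustrative assumption, not the
# reduction scheme used in the thesis.
GROUPS = ["AVLIMC", "FWYH", "STNQ", "KR", "DE", "GP"]

# Map each one-letter amino acid code to its class label ("0".."5").
CLASS_OF = {aa: str(i) for i, group in enumerate(GROUPS) for aa in group}


def reduce_sequence(seq):
    """Rewrite a protein sequence over the reduced alphabet.

    Residues outside the 20 standard codes (e.g. 'X') are dropped, which
    joins their neighbours; a real pipeline may prefer to handle them
    differently.
    """
    return "".join(CLASS_OF[aa] for aa in seq.upper() if aa in CLASS_OF)


def kmer_features(seq, k=2):
    """Return normalized k-mer frequencies of the reduced sequence.

    The vector has len(GROUPS) ** k entries in a fixed lexicographic order,
    so vectors from different sequences are directly comparable.
    """
    reduced = reduce_sequence(seq)
    labels = [str(i) for i in range(len(GROUPS))]
    vocab = ["".join(p) for p in product(labels, repeat=k)]
    counts = Counter(reduced[i:i + k] for i in range(len(reduced) - k + 1))
    total = sum(counts.values())
    return [counts[kmer] / total if total else 0.0 for kmer in vocab]


if __name__ == "__main__":
    example = "MKVLAG"  # made-up toy sequence
    print(reduce_sequence(example))
    print(len(kmer_features(example, k=2)))
```

With k classes and k-mers of length n, the feature vector has k^n entries instead of 20^n, which is the main practical benefit of reducing the alphabet before extracting sequence features. Vectors of this kind could then be passed to a classifier such as an SVM.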
[Degree-granting institution]: Zhejiang Sci-Tech University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: Q51
Article ID: 2427636
Article link: https://www.wllwen.com/shoufeilunwen/benkebiyelunwen/2427636.html