Construction and Application of a Management and Analysis System for Mass Spectrometry-Based Serum Peptidome Profiles
[Abstract]: In the post-genome era, the completion of genome sequencing for humans and other model organisms, together with major breakthroughs in mass spectrometry instruments and methods, has brought great progress in both basic research and clinical applications of proteomics. Clinical proteomics involves a variety of data types. Serum peptide profiles (blood peptide maps) are among the most important of these and are widely used in clinical proteomics based on non-gel systems. The basic principle is general: matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF/MS) or surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF/MS) is used to measure the accurate masses of the peptides in serum, and the resulting spectra are then processed with bioinformatics methods. By comparing the peptide maps of patients with those of healthy controls, one can discover disease-specific proteins or peptides and thus study the pathogenesis of disease at the protein level.
Blood peptide mapping has broad application prospects in biomarker discovery, early diagnosis and individualized treatment. However, three factors must be taken into account when it is applied in clinical research. First, sample selection affects the blood peptide map, both for the patients and for the normal controls whose samples a clinical study must collect. Normal control samples should take individual differences into account; differences among individuals in the normal control group include age, sex, race, family history and disease history. Patient samples should cover all disease subtypes, and the information collected should be as complete as possible to meet the needs of mathematical model building and validation. Second, sample collection introduces pre-analytical differences, that is, effects on the samples caused by environmental variation during collection, storage and transportation. Because these differences are generally unrelated to disease, they can make it harder to find disease-related differences in proteins or peptides and ultimately affect the blood peptide map results. Finally, instrumental analysis affects the blood peptide map. MALDI-TOF/MS and SELDI-TOF/MS are the main mass spectrometry platforms used for blood peptide mapping. Because of various factors in the mass spectrometry experiment, the raw spectra contain a large amount of noise, which must be removed by preprocessing.
Blood peptide maps have a large number of variables and samples. Faced with such complex data, only bioinformatics methods can identify a group of peptide peaks closely related to disease and uncover the disease-related information in the maps. However, existing data management and analysis tools cannot meet these needs. For this reason, we combined clinical proteomics with bioinformatics to develop BioSunMS, a management and analysis system for mass spectrometry-based serum peptide profiles. The system is built on the Eclipse plug-in architecture and developed in Java. It is easy to release and to extend through secondary development, has a friendly interface, and runs across platforms. It manages clinical samples and mass spectra and supports spectrum preprocessing and modeling analysis, so that researchers can carry out disease classification and typing studies conveniently and quickly. Finally, we illustrate the functions of BioSunMS with a sample classification and typing study based on the peptide maps of lung cancer patients.
1. Construction of the blood peptide map database
The blood peptide map database stores the serum peptide profiles, samples and clinical information of normal persons and of patients with various tumors (including lung cancer, liver cancer, breast cancer, rectal cancer, prostate cancer and leukemia). It mainly records sample sources, diagnostic methods, sample processing, mass spectrometry detection methods and serum peptide mass spectrum numbers. The database provides three main functions. The first is serum peptide map query, through which users can obtain the marker peaks of specific tumors and the corresponding peptide sequences. The second is data submission for blood peptide maps of various diseases, through which researchers can submit the disease blood peptide map data collected in their own laboratories and so enrich the range of diseases in the database. The third is disease information analysis, through which testing personnel can query the database directly with a clinically acquired blood peptide map and obtain disease-related information.
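The record types listed above can be pictured as two linked tables. The following is only a minimal sketch in Python with the standard `sqlite3` module, not the BioSunMS schema (the system itself is written in Java and its actual database design is not given here). All table names, column names and example values are hypothetical.

```python
import sqlite3

# Hypothetical schema: the table and column names are illustrative assumptions,
# chosen to mirror the fields named in the text, not the real BioSunMS design.
SCHEMA = """
CREATE TABLE sample (
    sample_id         INTEGER PRIMARY KEY,
    source            TEXT NOT NULL,   -- where the sample came from
    diagnosis         TEXT NOT NULL,   -- e.g. 'lung cancer' or 'normal'
    diagnostic_method TEXT,
    processing        TEXT             -- how the serum was handled
);
CREATE TABLE spectrum (
    spectrum_id     INTEGER PRIMARY KEY,
    sample_id       INTEGER NOT NULL REFERENCES sample(sample_id),
    ms_method       TEXT NOT NULL,     -- e.g. 'MALDI-TOF/MS'
    spectrum_number TEXT NOT NULL
);
"""


def open_database():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_sample(conn, source, diagnosis, diagnostic_method, processing):
    """Insert one sample record and return its generated id."""
    cur = conn.execute(
        "INSERT INTO sample (source, diagnosis, diagnostic_method, processing) "
        "VALUES (?, ?, ?, ?)",
        (source, diagnosis, diagnostic_method, processing),
    )
    return cur.lastrowid


def add_spectrum(conn, sample_id, ms_method, spectrum_number):
    """Attach one spectrum record to an existing sample."""
    conn.execute(
        "INSERT INTO spectrum (sample_id, ms_method, spectrum_number) "
        "VALUES (?, ?, ?)",
        (sample_id, ms_method, spectrum_number),
    )


def spectra_for_diagnosis(conn, diagnosis):
    """Return the spectrum numbers of all samples with the given diagnosis."""
    rows = conn.execute(
        "SELECT sp.spectrum_number FROM spectrum sp "
        "JOIN sample s ON s.sample_id = sp.sample_id "
        "WHERE s.diagnosis = ? ORDER BY sp.spectrum_id",
        (diagnosis,),
    ).fetchall()
    return [row[0] for row in rows]


# Made-up example record, for illustration only.
conn = open_database()
sid = add_sample(conn, "hospital A", "lung cancer", "pathology", "serum, frozen")
add_spectrum(conn, sid, "MALDI-TOF/MS", "S0001")
print(spectra_for_diagnosis(conn, "lung cancer"))
```

Keeping spectra in their own table lets one sample carry several spectra, which is one plausible way to support both the query and the submission functions described above.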
2. Software development for blood peptide map data processing and analysis
To carry out tumor classification and typing research rapidly and accurately on the basis of blood peptide map data, a blood peptide map data processing and analysis module was developed. After statistical analysis of the processed data, characteristic peaks can be found, a blood peptide map model can be established, and blind samples can be discriminated.
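As an illustration of what spectrum processing before statistical analysis can involve, here is a minimal sketch in plain Python. It assumes three common steps (moving-average smoothing, rolling-minimum baseline subtraction and local-maximum peak picking); the text does not state which algorithms BioSunMS actually uses, and the function names, window sizes, threshold and synthetic spectrum are all assumptions.

```python
def moving_average(values, window=3):
    """Smooth a spectrum with a centred moving average (shorter at the edges)."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def subtract_baseline(values, window=11):
    """Estimate the baseline as a rolling minimum and subtract it."""
    half = window // 2
    corrected = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        corrected.append(values[i] - min(values[lo:hi]))
    return corrected


def pick_peaks(mz, intensity, min_height):
    """Return (m/z, intensity) pairs for local maxima of at least min_height."""
    peaks = []
    for i in range(1, len(intensity) - 1):
        is_local_max = (intensity[i] > intensity[i - 1]
                        and intensity[i] >= intensity[i + 1])
        if is_local_max and intensity[i] >= min_height:
            peaks.append((mz[i], intensity[i]))
    return peaks


# Synthetic spectrum (made up): a flat baseline of 5 with two peaks.
mz = [1000 + i for i in range(15)]
raw = [5, 5, 5, 6, 12, 20, 12, 6, 5, 5, 9, 15, 9, 5, 5]

processed = subtract_baseline(moving_average(raw))
for peak_mz, height in pick_peaks(mz, processed, min_height=3):
    print(peak_mz, round(height, 2))
```

Real spectra would need the window sizes and the height threshold tuned to the instrument resolution and noise level.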
3. Tumor classification and typing based on blood peptide map data
Using support vector machines (SVM), principal component analysis (PCA), genetic algorithms (GA), naive Bayes, partial least squares (PLS) and other commonly used statistical and machine learning methods as tools, a tumor classification and typing module based on blood peptide map data was constructed. The module also provides a model parameter optimization function, which makes it convenient for researchers to carry out tumor classification and typing studies.
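To make the SVM option concrete, the sketch below trains a linear SVM in plain Python by full-batch subgradient descent on the L2-regularised hinge loss. It is an illustrative stand-in and not the BioSunMS implementation: the training procedure, the hyperparameters (`lam`, `lr`, `epochs`) and the two-peak intensity values are all assumptions made for this example.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=2000):
    """Fit w, b by subgradient descent on
    lam/2 * ||w||^2 + mean(max(0, 1 - y_i * (w . x_i + b)))."""
    n = len(X)
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # only margin violators contribute to the hinge term
                for j in range(n_features):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 (patient) or -1 (control) for one feature vector."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Made-up intensities at two hypothetical peaks: (peak A, peak B).
patients = [(8.0, 2.0), (7.5, 1.5), (9.0, 2.5), (8.5, 1.0)]
controls = [(2.0, 7.0), (1.5, 8.0), (2.5, 6.5), (3.0, 7.5)]
X = patients + controls
y = [1] * len(patients) + [-1] * len(controls)

w, b = train_linear_svm(X, y)
print([predict(w, b, x) for x in X])
```

In practice a kernel SVM from an established library, with parameters tuned by cross-validation, would be the more usual choice; the plain-Python version is used here only to keep the example self-contained.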
4. Establishment of the tumor-characteristic blood peptide map model
The study was carried out in collaboration with the National Center for Instrumental Analysis (NIAA). In previous work, NIAA had completed the collection of high resolution mass spectrometry (HRMS) data from 1000 healthy people and 2000 patients with lung cancer, liver cancer, breast cancer, rectal cancer, prostate cancer and leukemia. We analyzed the blood peptide maps of 254 lung cancer patients and 257 normal controls in the database. First, we constructed the training set from the blood peptide maps of 150 lung cancer patients and 150 controls; the remaining 104 lung cancer patients and 107 normal controls formed the test set. Seventy-four characteristic peaks were selected using the criterion P < 0.005. Based on these variables, we built a classification model of the lung cancer blood peptide map with SVM and validated it on the test set. On the test set, the accuracy, sensitivity and specificity of classification were 92.3%, 96.3% and 94.3% respectively. Based on the spectral peak information, an early diagnosis model of lung cancer based on mass spectrometric blood peptide maps was constructed, and the early diagnosis of lung cancer was explored in a preliminary way.
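Accuracy, sensitivity and specificity are standard confusion-matrix quantities. The sketch below shows how they are computed from true labels and model predictions; the function name and the example labels are made up for illustration and are not the study's data.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity and specificity for a binary classifier.

    sensitivity = TP / (TP + FN): fraction of patients correctly identified
    specificity = TN / (TN + FP): fraction of controls correctly identified
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Made-up labels: 1 = lung cancer patient, 0 = normal control.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
print(classification_metrics(y_true, y_pred))
```

Because accuracy pools both classes, it equals the average of sensitivity and specificity weighted by the number of patients and controls in the test set.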
In summary, we constructed BioSunMS, a software system that integrates database management and analysis of mass spectrometry serum peptide profiles. The system was used to analyze lung cancer blood peptide map data and to build an early diagnosis model of lung cancer, providing bioinformatics support for related research based on mass spectrometry blood peptide maps.
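[Degree-granting institution]: Academy of Military Medical Sciences, Chinese People's Liberation Army
[Degree level]: Doctoral
[Year degree granted]: 2009
[Classification number]: R346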