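Research and Implementation of a Diagnostic Classification Method for Thyroid Nodules

Published: 2018-04-09 05:35

  Topic: ensemble learning. Focus: thyroid nodules. Source: Donghua University, master's thesis, 2017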


[Abstract]: Thyroid disease is a common class of endocrine disorders that mainly presents as hyperthyroidism, hypothyroidism, thyroiditis, and thyroid nodules. Among these, thyroid nodules pose a comparatively serious threat to health, and their incidence is rising year by year. Patients with thyroid nodules leave behind large volumes of electronic medical record data during their visits, and improving the clinical diagnosis of thyroid nodules requires mining the information hidden in these data efficiently and accurately. In conventional clinical diagnosis, physicians must perform ultrasound, blood tests, fine-needle aspiration, and other examinations before they can make a preliminary judgment on whether a nodule is benign or malignant, and even then the diagnostic accuracy remains unsatisfactory. Traditional machine learning algorithms, for their part, show high error when trained and evaluated on real medical datasets, because they do not account for two particular properties of such data: sparsity and class imbalance. Against this background, and with the aim of sparing patients unnecessary examinations while improving the accuracy and efficiency of thyroid nodule identification, this thesis proposes an identification method based on ultrasound examination features, improves on an existing ensemble learning base model to build a customized thyroid nodule identification model, and designs and implements an assisted identification system based on ultrasound examination data. The thesis first analyzes a clinical thyroid nodule dataset in terms of basic patient information, biochemical indicators, and clinical diagnoses, studying the relationships among the indicators and among the diagnoses to provide a basis for clinical treatment. It then structures the free-text electronic thyroid ultrasound records, extracts valid structured feature attributes, and applies the necessary preprocessing, such as balancing and numerical encoding, to convert them into a form that machine learning classification algorithms can consume, which simplifies data analysis and modeling in the experiments. Finally, it adapts the existing ensemble learning base model to medical data by adding a custom term to its objective function, yielding a new identification model that reduces the error caused by the sparsity and imbalance of the dataset and improves prediction accuracy. In the accompanying assisted identification system, patients and physicians enter the relevant ultrasound examination features and receive a predicted result in real time, which automates thyroid nodule identification and makes the examination more efficient. To validate the proposed method, the experiments compare it with random forest, support vector machine, and neural network algorithms on a real medical dataset and on a UCI benchmark dataset. The proposed method achieves the highest accuracy on both, reaching 92.43% and 94% respectively.
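The abstract names the preprocessing steps (numerical encoding, balancing) and a custom term in the ensemble model's objective, but gives no formulas. The sketch below is only an illustration of those ideas under stated assumptions, not the thesis's actual method: it assumes a gradient-boosting style ensemble that consumes per-sample gradients and Hessians, it swaps in a plain class-weighted logistic loss for the unspecified custom term, and the feature names (`echogenicity`, `margin`) and helper functions are hypothetical.

```python
import math
from collections import Counter


def encode_one_hot(records, feature_values):
    """Turn categorical records into 0/1 vectors; feature_values fixes column order."""
    rows = []
    for rec in records:
        row = []
        for name, values in feature_values.items():
            row.extend(1.0 if rec.get(name) == v else 0.0 for v in values)
        rows.append(row)
    return rows


def class_weights(labels):
    """Weight each class inversely to its frequency: n / (k * count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in counts}


def weighted_logistic_grad_hess(raw_scores, labels, weights):
    """Gradient and Hessian of the class-weighted logistic loss w.r.t. the raw score.

    Loss per sample: -w_y * [y*log(p) + (1-y)*log(1-p)], with p = sigmoid(score).
    """
    grads, hess = [], []
    for s, y in zip(raw_scores, labels):
        p = 1.0 / (1.0 + math.exp(-s))
        w = weights[y]
        grads.append(w * (p - y))
        hess.append(w * p * (1.0 - p))
    return grads, hess


# Hypothetical ultrasound features; 1 = malignant (minority), 0 = benign.
feature_values = {
    "echogenicity": ["hypoechoic", "isoechoic"],
    "margin": ["clear", "irregular"],
}
records = [
    {"echogenicity": "hypoechoic", "margin": "irregular"},
    {"echogenicity": "isoechoic", "margin": "clear"},
    {"echogenicity": "isoechoic", "margin": "clear"},
    {"echogenicity": "hypoechoic", "margin": "clear"},
]
labels = [1, 0, 0, 0]

X = encode_one_hot(records, feature_values)
w = class_weights(labels)
# At the initial raw score of 0 every sample has p = 0.5.
g, h = weighted_logistic_grad_hess([0.0] * len(labels), labels, w)
```

With these weights the single malignant sample pulls on the first boosting step as hard as the three benign samples combined, which is one common way to counter class imbalance. Reweighting the loss is an alternative to resampling the data, and the abstract does not say which of the two the thesis uses.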
[Degree-granting institution]: Donghua University
[Degree level]: Master's
[Year conferred]: 2017
[Classification number]: R581;TP311.52
