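Research on the Class Imbalance Problem in Applying Multi-Label Learning to Traditional Chinese Medicine Diagnosis of Parkinson's Disease
Published: 2018-06-07 01:51
Topics: multi-label classification + multi-label class imbalance; Source: Nanjing University, master's thesis, 2016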
[Abstract]: Parkinson's disease (PD) is a chronic degenerative disease of the central nervous system that is common among middle-aged and elderly people. Traditional Chinese medicine (TCM) has a long history of studying Parkinson's disease, and opinions on its syndrome types vary widely. Drawing on many years of diagnostic and treatment experience, modern TCM has identified five syndrome types of Parkinson's disease and holds that a patient presents with at most two syndrome types at the same time, one primary and one secondary. To standardize the TCM diagnostic process for Parkinson's disease, modern TCM has proposed a Parkinson's TCM scale that covers the clinical symptoms associated with the disease. The TCM community has still not reached a consensus on how to infer the specific syndrome type from the symptoms on the scale, and diagnosis remains largely experience-based. This thesis applies multi-label learning to the TCM diagnosis of Parkinson's disease: it separates syndrome types into primary and secondary, and uses multi-label algorithms to uncover latent relationships between symptoms and syndrome types, with the aim of providing decision support for the TCM diagnostic process. The main contributions are as follows. 1) To apply multi-label learning to the TCM diagnosis of Parkinson's disease, the symptoms on the scale are used as feature attributes and the syndrome types after primary/secondary separation are used as labels. Based on the sparsity of secondary syndromes, the thesis describes the fairly severe multi-label class imbalance present in the Parkinson's dataset. 2) To address the lack of data representation for minority-class samples under multi-label imbalance, an adaptive minority-class sample synthesis algorithm is proposed, based on the ideas of distinguishing samples by their contribution degree and filtering out anomalous samples. The algorithm addresses multi-label class imbalance well at the data level and obtains better experimental results than existing multi-label resampling algorithms. 3) To address the effect of label correlation on multi-label imbalance, an under-sampling ensemble algorithm over label-subset samples is proposed, based on the ideas of label-subset construction and under-sampling ensembles. Experimental results show that, compared with existing multi-label algorithms, the algorithm handles imbalance better on the Parkinson's dataset and on several public datasets.
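The labeling scheme and the resulting imbalance described in contribution 1 can be illustrated with a small sketch. It is not the thesis's code: the syndrome names (S1, S2, S3), the toy patient list, and the helper names build_labels and imbalance_ratios are all hypothetical, and the per-label imbalance ratio used here (count of the most frequent label divided by the count of each label) is one common measure from the multi-label literature that the abstract does not confirm the thesis uses.

```python
from collections import Counter


def build_labels(primary, secondary=None):
    """Label set for one patient after primary/secondary separation.

    Each syndrome type yields two distinct labels, one for when it is the
    primary syndrome and one for when it is the secondary syndrome.
    """
    labels = {f"{primary}_primary"}
    if secondary is not None:
        labels.add(f"{secondary}_secondary")
    return labels


def imbalance_ratios(label_sets):
    """Per-label imbalance ratio: count of most frequent label / count of this label."""
    counts = Counter(label for labels in label_sets for label in labels)
    most_frequent = max(counts.values())
    return {label: most_frequent / n for label, n in counts.items()}


# Toy data (hypothetical): (primary syndrome, optional secondary syndrome).
patients = [
    ("S1", None),
    ("S1", "S2"),
    ("S1", None),
    ("S3", None),
    ("S1", None),
    ("S3", "S1"),
]

label_sets = [build_labels(p, s) for p, s in patients]
ratios = imbalance_ratios(label_sets)
mean_ir = sum(ratios.values()) / len(ratios)

for label in sorted(ratios):
    print(label, ratios[label])
print("mean imbalance ratio:", mean_ir)
```

Because most patients in the toy data have no secondary syndrome, the secondary labels are rare and receive the highest ratios, which is the kind of sparsity-driven imbalance the abstract attributes to the Parkinson's dataset.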
[Degree-granting institution]: Nanjing University
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP301.6; R277.7
Article ID: 1989161
Article link: https://www.wllwen.com/zhongyixuelunwen/1989161.html