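A Dynamic Classification Model Based on Correlation: The Case of Skin and TCM Constitution
Published: 2018-05-24 16:03
Topics: correlation + information fusion; Source: Advanced Engineering Sciences (《工程科学与技术》), 2017, Issue 03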
[Abstract]: This paper presents a scientific, quantitative study of the correlation between facial skin condition indicators and traditional Chinese medicine (TCM) constitution types. Treating the work as a process in which test data accumulate continuously and knowledge discovery deepens, it attempts to reveal the complex, dynamically evolving relationship between a person's internal TCM constitution and external skin condition indicators. The approach combines the good inductive properties of decision trees under small-sample conditions with the high classification accuracy of Bayesian algorithms under large-sample conditions. Because the amount of modeling data is expected to keep growing, a fusion classification algorithm is constructed in which the weights of the decision tree and the fuzzy naive Bayes classifier are revised adaptively, so that the model retains good classification performance and interpretability throughout the accumulation of test data from small to large. The decision tree uses optimal post-pruning to avoid the overfitting of conventional decision trees; the naive Bayes algorithm defines fuzzy membership degrees for the intervals to which indicators belong, in order to handle the randomness and fuzziness present in skin attribute testing and classification. Empirical results show that the fusion weights of the proposed model can be adjusted dynamically and that classification accuracy improves as the modeling data increase. At present, the model built on 151 modeling samples achieves a classification accuracy of 86.7%, higher than the 83.3% and 80% of the standalone decision tree and naive Bayes classifiers respectively, and higher than the 76.7% obtained by the control model built on 80 modeling samples. The analysis indicates that, by making effective use of the modeling data, the dynamic skin and constitution classification model can identify the complex correlation between facial skin condition indicators and internal TCM constitution. The resulting model has good accuracy and interpretability, and is a useful exploration toward the scientific and intelligent development of data-driven TCM theory.
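[Affiliations]: Beijing Key Laboratory of Big Data Technology for Food Safety, School of Computer and Information Engineering, Beijing Technology and Business University; China Cosmetics Research Center, Beijing Technology and Business University
[Funding]: Key Project of the Science and Technology Development Plan of the Beijing Municipal Education Commission (KZ201510011011); Beijing Technology and Business University Project for Promoting Comprehensive Reform of Talent Training (19005428069/007); Beijing Technology and Business University Graduate Innovation Fund
[CLC number]: R229; TP181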
Article ID: 1929699
Article link: https://www.wllwen.com/kejilunwen/zidonghuakongzhilunwen/1929699.html