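Research on Preprocessing Methods for Medical Physical Examination Data
Published: 2018-04-27 09:08
Topics: physical examination data + preprocessing; Source: Application Research of Computers (《计算机应用研究》), 2017, No. 04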
[Abstract]: Raw physical examination data are ambiguous, noisy, incomplete, and redundant, so they cannot be used directly for disease risk assessment and prediction. Because of shortcomings in the structure and format of such data, traditional data preprocessing methods are not suitable. To fully mine the valuable information in physical examination data, this paper proposes preprocessing methods tailored to such data from several angles: data reduction based on a compression method lowers the time and space complexity of preprocessing; a field matching algorithm based on word segmentation and weights cleans the data and resolves inconsistencies among records; and data transformation based on a linear function makes examination data from different years consistent and continuous. Experimental results show that the field matching algorithm based on word segmentation and weights is more accurate than traditional algorithms.
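The abstract does not give the details of the field matching algorithm or the linear transformation, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that field names are split into word tokens with a simple regular expression (a stand-in for a real word segmenter), that similarity is a weighted overlap of token sets, and that the linear transformation is a unit conversion of the form y = scale * x + offset. The names `TOKEN_WEIGHTS`, `tokenize`, `weighted_similarity`, `match_field`, and `linear_transform`, as well as the weights and the threshold, are all hypothetical.

```python
import re

# Hypothetical weights: generic tokens contribute less to the match score
# than informative ones. These values are illustrative, not from the paper.
TOKEN_WEIGHTS = {"count": 0.5, "test": 0.2, "value": 0.2}
DEFAULT_WEIGHT = 1.0


def tokenize(name):
    """Split a field name into a set of lowercase word tokens.

    A regular expression stands in for a real word segmenter here.
    """
    return set(re.findall(r"[a-z0-9]+", name.lower()))


def token_weight(token):
    """Look up the weight of a token, falling back to the default."""
    return TOKEN_WEIGHTS.get(token, DEFAULT_WEIGHT)


def weighted_similarity(a, b):
    """Weighted overlap of two field names: shared weight / total weight."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    union = tokens_a | tokens_b
    if not union:
        return 0.0
    shared = sum(token_weight(t) for t in tokens_a & tokens_b)
    total = sum(token_weight(t) for t in union)
    return shared / total


def match_field(raw_name, canonical_fields, threshold=0.5):
    """Return the best-matching canonical field, or None below the threshold."""
    best = max(canonical_fields, key=lambda c: weighted_similarity(raw_name, c))
    if weighted_similarity(raw_name, best) >= threshold:
        return best
    return None


def linear_transform(x, scale, offset=0.0):
    """Map a measurement onto a common scale with y = scale * x + offset."""
    return scale * x + offset


if __name__ == "__main__":
    canonical = [
        "fasting blood glucose",
        "white blood cell count",
        "systolic blood pressure",
    ]
    # An inconsistently named field from another year's records.
    print(match_field("Blood glucose (fasting) test", canonical))
    # A height recorded in inches, converted to centimetres.
    print(linear_transform(70.0, 2.54))
```

In this sketch, matching on weighted tokens rather than on the raw string means that word order and filler words have little effect on the score, which is one plausible way to reconcile field names that differ between examination years.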
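[Author Affiliations]: Henan Provincial Collaborative Innovation Center for Internet Medical and Health Services, Zhengzhou University; School of Software and Applied Technology, Zhengzhou University; School of Information Engineering, Zhengzhou University
[Funding]: Henan Province Key Science and Technology Research Project (152102210249)
[Classification Number]: TP311.13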
Article ID: 1810115
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/1810115.html