Research on Key Technologies for Interactive Statistical Analysis of Military Medical Service Big Data
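Topics: medical services + big data; Source: Academy of Military Medical Sciences of the Chinese People's Liberation Army, 2016 doctoral dissertation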
[Abstract]: In recent years, with the widespread use of computerized information tools, the level of informatization in military health statistics has risen steadily. A health statistics portal website now provides statistical query services to senior leaders at headquarters, and great progress has been made in data utilization. However, current statistical methods and systems still suffer from incomplete statistical indicators, insufficiently fine statistical granularity, and slow interactive query response, so their support for decision-making is inadequate. At present, the military has preliminarily achieved automatic collection of medical service information across the entire force. Structured data alone amounts to tens of billions of records collected each year, and military health statistics has entered the big data era. The current statistical workflow and software, however, need about a week to complete the annual statistical review, which cannot meet practical needs. For this reason, the former Health Department of the General Logistics Department launched the "Military Health Statistics Innovation Project" as a key task of military-wide health informatization during the 12th Five-Year Plan period, with big data statistical processing methods and technologies as an important underpinning.

Interactive statistical analysis of military medical service big data can provide a fine-grained statistical mode with the "day" as its unit, based on massive raw medical data, and can supply data support for health service decisions at headquarters. This enables timely understanding of the distribution and utilization of medical resources, rapid response to and handling of public health emergencies, and stronger guidance, management, and supervision of medical service institutions. It can also offer generally applicable methodological guidance for building military and national health statistics systems and regional medical platforms, and technical support for building a military-wide medical big data service platform, thereby promoting innovation in health service management and support from an extensive mode to a refined one.

This dissertation uses literature research, comparative analysis, expert consultation, systems analysis, surveys, and empirical research. It analyzes the state of development of health statistics inside and outside the military; defines and summarizes the relevant theories and concepts and the sources, scope, and characteristics of military medical service big data; and constructs a framework for the military health statistical indicator system. Focusing on the functional and performance requirements for the statistics, analysis, and use of military medical service data in the big data era, it reviews the methods and steps for interactive statistical processing of the large-sample, distributed, homogeneous, structured, and complexly related data that the military health information center extracts from more than 200 central hospitals through the "direct data reporting" system, and it proposes a Spark-based parallel computing solution. It selects the key technologies needed for the functional modules of data preprocessing, distributed storage, interactive intelligent statistics, and multidimensional visualization; completes the architecture design of the interactive analysis platform for military medical service big data; implements a system prototype on the Spark computing platform; and compares and verifies the prototype's functions and performance using 6 test datasets of different sizes and an 8-node Spark cluster.

1. Service requirements analysis. Starting from the service requirements of health service support, this part analyzes the functional and performance indicators that a statistical analysis platform based on medical service big data must have. First, it studies the concepts, basic theory, and domestic and international research status of military medical service data statistics, and characterizes the data as "large-sample, complexly related data". Second, it systematically analyzes the sources, scope, and characteristics of medical service big data. Third, it classifies the existing military health statistical indicators from a business perspective and constructs an indicator system framework with 5 levels (business domain, business topic, statistical purpose, statistical dimension, and analysis indicator), refining business topics such as outpatient and inpatient care within the medical service domain. Fourth, it sets out the functional and performance requirements of the interactive statistics platform.

2. Selection of key technologies for interactive statistics. Building on the requirements analysis, this part analyzes the general data processing flow of the interactive statistics platform and identifies the key technologies required: distributed storage, NoSQL databases, a general-purpose big data processing platform, and a web framework for big data visualization. It compares the strengths and weaknesses of the candidate technologies, draws on their concrete applications in the Internet, finance, e-commerce, and medical service industries, and, taking the characteristics of medical service big data into account, selects a combination suited to interactive statistical analysis: Sqoop provides ETL with incremental update support for medical service data, HDFS and HBase store the medical service big data and its computed result sets, the Spark computing framework provides interactive, efficient parallel computing, and Web2py provides multidimensional visualization.

3. System design of the interactive statistics platform for medical service big data. By reviewing the construction goals of the platform, this part designs its architecture and divides it functionally into three basic modules: external data ingestion and storage; multi-paradigm data analysis and extraction; and interactive query and data presentation. The corresponding architectures and algorithms are designed in three areas: data preprocessing and storage, efficient parallel computing services, and visualization.

4. Prototype implementation and validation. The results of the preceding parts guide the prototype design, the choice of development environment, and deployment and operation. A prototype of the designed platform is implemented on the Spark computing platform, and its functions are verified. On this basis, taking the data tables involved in the outpatient workflow as an example, 6 test datasets of linearly increasing size and an 8-node Spark cluster are used for comparative testing of the system's functions and performance. The computation types tested include representative operations in statistical processing: simple grouped reduction, sum reduction, and multi-table joins. Built on a combination of big data technologies (the incremental-update-capable ETL tool Sqoop, the distributed file system HDFS, the distributed database HBase, the in-memory Spark framework, and the simple, efficient Web2py visualization platform), the prototype supports interactive statistical queries over medical service data of more than 100 million records. While satisfying the data preprocessing, storage, computation, and visualization functions, its task processing efficiency improves nearly linearly as hardware node resources are added.

This study is a useful exploration and a successful attempt at applying big data processing technology to the interactive statistical analysis of medical service big data. It provides first-hand practical material for building a military-wide health information statistics platform and for the further mining and use of medical service big data.
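The three representative operation types named in the abstract (simple grouped reduction, sum reduction, and multi-table join) can be illustrated with a small sketch at daily granularity. This is not the dissertation's code: it is plain single-machine Python rather than Spark, and the tables (`visits`, `charges`), field names, and sample rows are all hypothetical, chosen only to show what each operation computes.

```python
from collections import defaultdict

# Hypothetical outpatient tables; names and values are illustrative only.
visits = [
    {"visit_id": 1, "hospital_id": "H01", "date": "2015-03-01"},
    {"visit_id": 2, "hospital_id": "H01", "date": "2015-03-01"},
    {"visit_id": 3, "hospital_id": "H02", "date": "2015-03-01"},
    {"visit_id": 4, "hospital_id": "H01", "date": "2015-03-02"},
]
charges = [
    {"visit_id": 1, "amount": 120.0},
    {"visit_id": 1, "amount": 30.0},
    {"visit_id": 2, "amount": 80.0},
    {"visit_id": 4, "amount": 55.5},
]


def count_by(records, key_fields):
    """Simple grouped reduction: count records per key tuple."""
    counts = defaultdict(int)
    for rec in records:
        counts[tuple(rec[f] for f in key_fields)] += 1
    return dict(counts)


def sum_by(records, key_fields, value_field):
    """Sum reduction: total of value_field per key tuple."""
    totals = defaultdict(float)
    for rec in records:
        totals[tuple(rec[f] for f in key_fields)] += rec[value_field]
    return dict(totals)


def hash_join(left, right, key):
    """Inner join: index the left table by key, then stream the right table."""
    index = defaultdict(list)
    for row in left:
        index[row[key]].append(row)
    return [{**l, **r} for r in right for l in index.get(r[key], [])]


# Visits per hospital per day (grouped count).
daily_visits = count_by(visits, ["hospital_id", "date"])

# Charges per hospital per day (join, then grouped sum).
joined = hash_join(visits, charges, "visit_id")
daily_revenue = sum_by(joined, ["hospital_id", "date"], "amount")
```

In a distributed engine the same three patterns are expressed as key-based aggregations and joins over partitioned data, which is where adding nodes can shorten the run time; the sketch only fixes the semantics of each operation.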
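[Degree-granting institution]: Academy of Military Medical Sciences of the Chinese People's Liberation Army
[Degree level]: Doctorate
[Year degree granted]: 2016
[Classification number]: R82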
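Related doctoral dissertations (top 1)
1 Fan Weiwei (范炜玮); Research on Key Technologies for Interactive Statistical Analysis of Military Medical Service Big Data [D]; Academy of Military Medical Sciences of the Chinese People's Liberation Army; 2016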
Article ID: 1798101
Article link: https://www.wllwen.com/yixuelunwen/yundongyixue/1798101.html