Research on Hadoop-Based MeteCloud Resource Storage and Data Processing
Posted: 2018-01-16 23:34
Source: Nanjing University of Information Science and Technology, master's thesis, 2013. Document type: degree thesis
Keywords: Hadoop; MeteCloud; Hive; HBase; daily-value meteorological data
[Abstract]: At present, meteorological departments at every level operate their own independent business and storage systems, so meteorological data cannot be centrally managed or shared efficiently. The emergence and rapid development of cloud computing offers a solution to this problem. Building on an analysis of the theoretical models behind cloud platforms, this thesis takes the daily-value data files of climate data from China's surface international exchange stations (1951 to 2012) as its research object. The main work is as follows:
(1) It analyzes the architecture and the data read/write flow of HDFS, the distributed file system of the open-source cloud platform Hadoop; the data processing flow of the MapReduce computing model; the architecture and table creation process of the distributed database HBase; and the architecture of the data warehouse Hive, together with how it stores and queries data.
(2) It proposes the architecture and cluster deployment process of the meteorological cloud platform MeteCloud (Meteorological Cloud). The MeteCloud architecture comprises a hardware layer, a platform layer, an application layer, and a user layer. The architecture adopts Facebook's AvatarNode mechanism to address the single point of failure of the metadata node, and the thesis analyzes AvatarNode's working principle and operating cycle.
(3) It studies how static daily-value meteorological data files are transferred into storage on the MeteCloud platform, covering both the Hive-based transfer process, HiveDaily, and the HBase-based transfer process, HBaseDaily. In addition, based on the MapReduce programming model, it proposes an optimized HBase transfer process, MRHBaseDaily (MapReduce-based HBaseDaily), to improve HBase transfer efficiency.
(4) It studies MapReduce-based processing of daily-value meteorological data. The thesis analyzes the conventional SMT (Statistics of Maximum of Temperature) process on a local file system and proposes MRSMT (MapReduce-based SMT), a MapReduce-based statistical process for daily-value meteorological data.
A MeteCloud platform was built in the laboratory and used to transfer and process the daily-value data files. The results show that MeteCloud can store and process daily-value meteorological data efficiently, and that the optimized HBase storage process and the MRSMT process improve transfer and data processing efficiency.
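The abstract does not give the MRSMT algorithm or the layout of the daily-value files, so the following is only a minimal sketch of the general idea behind a MapReduce-style maximum temperature statistic, written in plain Python rather than against the Hadoop API. The record layout (`station_id,date,daily_max_temp`), the function names, and the sample values are all assumptions made for illustration and are not taken from the thesis.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical record layout: "station_id,YYYY-MM-DD,daily_max_temp_celsius".
# The real daily-value file format is not described in the abstract.


def map_record(line: str) -> Iterator[Tuple[str, float]]:
    """Map step: emit (station_id, daily_max_temp) for one well-formed line."""
    parts = line.strip().split(",")
    if len(parts) != 3:
        return  # skip malformed lines
    station, _date, temp = parts
    try:
        yield station, float(temp)
    except ValueError:
        return  # skip lines whose temperature field is not numeric


def shuffle(pairs: Iterable[Tuple[str, float]]) -> Dict[str, List[float]]:
    """Shuffle step: group all emitted values by their key (station id)."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for station, temp in pairs:
        grouped[station].append(temp)
    return grouped


def reduce_station(station: str, temps: List[float]) -> Tuple[str, float]:
    """Reduce step: keep the highest temperature seen for one station."""
    return station, max(temps)


def max_temperature_by_station(lines: Iterable[str]) -> Dict[str, float]:
    """Run map, shuffle, and reduce in-process over an iterable of lines."""
    pairs = (pair for line in lines for pair in map_record(line))
    return dict(reduce_station(s, t) for s, t in shuffle(pairs).items())
```

On a real cluster the map and reduce functions would run in parallel across HDFS blocks, and the framework would perform the grouping that `shuffle` imitates here. Because taking a maximum is associative, the same reduce logic could also serve as a combiner to cut the data sent between nodes.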
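[Degree-granting institution]: Nanjing University of Information Science and Technology
[Degree level]: Master's
[Year degree awarded]: 2013
[Classification number]: TP333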