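Research on a Hadoop-Based Distributed Storage Strategy for Seismic Data
Published: 2018-02-14 09:30
Keywords: Hadoop; seismic data; distributed; distributed computing. Source: Northeast Petroleum University, 2014 master's thesis. Document type: degree thesis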
[Abstract]: In practical seismic data processing, many factors affect processing efficiency. Broadly, they fall into two categories, software and hardware, that is, the access methods and the configuration of the access environment. However, the continued development and optimization of access methods and the need to upgrade the server storage and access environment incur large costs, while further optimization of the access methods is becoming increasingly difficult.

To address both the bottleneck in optimizing access methods and the cost of upgrading storage servers, this thesis studies the storage characteristics of seismic data and, drawing on Hadoop and current big-data storage and access technology, proposes a Hadoop-based distributed storage strategy for seismic data. The strategy is used to optimize the storage and access environment for seismic data and to improve equipment utilization. The specific research content is as follows:

1. Suitability of Hadoop for distributed storage of seismic data. The data storage structure of the Hadoop distributed framework is examined against the data structure and access characteristics of seismic data, and the organizational structure and cluster configuration factors that distributed storage of seismic data must account for are considered. By combining Hadoop's data access methods with seismic data access methods, and taking a cluster of inexpensive machines as the premise, an overall framework for the distributed storage strategy is proposed.

2. Organization strategy for distributed storage of seismic data. Based on the characteristics of the Hadoop cluster environment, the block size, block allocation, and data integrity of the seismic data are organized, and the environment parameters are then configured so that the data are stored more efficiently in Hadoop's distributed file system. Experiments are used to identify the environment parameter configuration that best fits the characteristics of seismic data and the optimal data organization strategy.

3. Design of a Hadoop-based seismic data access module. To further verify the advantages of Hadoop for distributed computing on seismic data, access modules are developed both on Hadoop's MapReduce programming framework and in the existing seismic data access environment, and the two are compared. By varying the corresponding environment parameters, the efficiency of Hadoop-based distributed storage of seismic data is verified, and the effect of the number of distributed nodes and of the data size on data access efficiency is determined.

Finally, the research content is brought together, the individual optimization techniques are implemented, and a complete distributed storage strategy for seismic data is proposed, in order to verify the feasibility and effectiveness of the optimization techniques and methods proposed in the thesis.
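The abstract names block size as one of the parameters to organize but gives no concrete rule. As an illustration only, and not the thesis's method, the following minimal sketch picks a block size that holds a whole number of fixed-length trace records so that no trace straddles two blocks. The assumptions are: traces are stored as fixed-length records with any file-level header kept separately; the trace layout (a 240-byte header plus 1000 four-byte samples) is hypothetical; the 64 MiB target is an example value; and the block size must be a multiple of a 512-byte checksum chunk. The function name `aligned_block_size` is invented for this sketch.

```python
import math


def aligned_block_size(trace_bytes: int, target_bytes: int,
                       checksum_chunk: int = 512) -> int:
    """Return the largest size <= target_bytes that is a multiple of both
    the trace record length and the checksum chunk size."""
    if trace_bytes <= 0 or target_bytes <= 0 or checksum_chunk <= 0:
        raise ValueError("all sizes must be positive")
    # Least common multiple of the two alignment constraints.
    unit = trace_bytes * checksum_chunk // math.gcd(trace_bytes, checksum_chunk)
    if unit > target_bytes:
        raise ValueError("target is smaller than one aligned unit")
    # Round the target down to a whole number of aligned units.
    return (target_bytes // unit) * unit


# Hypothetical trace layout (assumption, not from the thesis):
# 240-byte trace header followed by 1000 samples of 4 bytes each.
TRACE_BYTES = 240 + 1000 * 4
TARGET_BYTES = 64 * 1024 * 1024  # example target of 64 MiB

block_size = aligned_block_size(TRACE_BYTES, TARGET_BYTES)
traces_per_block = block_size // TRACE_BYTES
print(block_size, traces_per_block)
```

A value computed this way could be supplied as the per-file block size when the data are written to HDFS (the `dfs.blocksize` property in Hadoop 2.x, `dfs.block.size` in 1.x). Whether such alignment actually improves access efficiency is what experiments like those described in the abstract would have to establish.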
[Degree-granting institution]: Northeast Petroleum University
[Degree level]: Master's
[Year degree awarded]: 2014
[Classification number]: TP333
Article ID: 1510391
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/1510391.html