Design and Implementation of a Hadoop Fault-Tolerance Testing Platform

Published: 2018-06-05 08:27

  Topics: fault-tolerance evaluation + cloud computing; Source: Harbin Institute of Technology, master's thesis, 2013


[Abstract]: As the volume of data in information systems grows rapidly, traditional computing and storage models can no longer meet the rising demand for data processing and storage. Cloud computing, which evolved from earlier distributed processing, parallel processing, and grid computing technologies, has become a widely adopted approach to handling massive data. As cloud computing platforms become more widespread, however, reliability has become a major challenge. Fault tolerance reflects one aspect of a system's reliability, so evaluating the fault tolerance of a cloud computing platform is important for research on its reliability.

Because cloud computing platforms are highly complex and their software is large, evaluating their fault tolerance is a difficult task. Existing research on cloud platform testing covers fault-tolerance evaluation only sparsely and relies on a narrow set of test methods, so further study and refinement are needed.

The most effective means of evaluating fault tolerance is testing based on fault injection. This thesis takes the open-source cloud computing platform Hadoop as its subject and studies the fault-tolerance mechanisms of Hadoop's core components in depth. Based on the fault types that may occur in real deployments, and on the deployment structure and runtime mechanisms of the Hadoop platform, it proposes a multi-level framework for testing Hadoop's fault tolerance. The framework has four levels: software robustness testing, MapReduce fault injection testing, network fault injection testing, and HDFS fault injection testing. Each level simulates a variety of software and hardware anomalies that a cloud platform may encounter in use.

Following this framework, a fault-tolerance evaluation platform for Hadoop was designed and several fault injection tools were implemented, covering software robustness testing of Hadoop as well as fault-tolerance testing for node faults, network faults, disk faults, and other fault types. During fault injection, the platform monitors and collects the cloud platform's responses to the faults and analyzes the results to give researchers accurate and reliable evaluation results, ultimately providing solid data to support fault-tolerance evaluation of cloud computing platforms.

To verify the feasibility of the evaluation method, a small cloud computing environment was built with Hadoop for experiments. Software robustness testing revealed deficiencies in Hadoop's interfaces and implementation, and the defects were localized. Injection tests for node- or process-level failures, data operation failures, data verification faults, resource overload faults, and network faults in the test environment demonstrated the effectiveness of each fault injection tool. By running Hadoop's benchmark performance programs and comparing performance before and after fault injection, the fault tolerance of the Hadoop platform can be evaluated qualitatively.
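The abstract's last step, comparing benchmark performance before and after fault injection to reach a qualitative verdict, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the `RunResult` record, the `slowdown` and `assess` helpers, the verdict labels, and the 1.5x threshold are all hypothetical choices made for illustration, and the run times are assumed to have been measured separately (for example, by timing a benchmark job with and without a fault injected).

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass(frozen=True)
class RunResult:
    """One benchmark run. Fields are illustrative, not taken from the thesis."""
    completed: bool  # did the job finish successfully?
    seconds: float   # wall-clock duration of the run


def slowdown(baseline: Sequence[RunResult], injected: Sequence[RunResult]) -> float:
    """Ratio of mean runtime with a fault injected to mean baseline runtime."""
    base = mean(r.seconds for r in baseline)
    hurt = mean(r.seconds for r in injected)
    return hurt / base


def assess(baseline: Sequence[RunResult],
           injected: Sequence[RunResult],
           tolerated_ratio: float = 1.5) -> str:
    """Qualitative verdict; the 1.5x threshold is an arbitrary assumption."""
    # Any job that failed under the fault means the fault was not masked.
    if not all(r.completed for r in injected):
        return "not tolerated"
    # All jobs finished: grade by how much slower they ran.
    if slowdown(baseline, injected) <= tolerated_ratio:
        return "tolerated"
    return "tolerated with degradation"


if __name__ == "__main__":
    baseline = [RunResult(True, 100.0), RunResult(True, 110.0)]
    after_fault = [RunResult(True, 200.0), RunResult(True, 220.0)]
    print(assess(baseline, after_fault))
```

Separating the numeric `slowdown` from the labelled `assess` keeps the raw measurement available while leaving the thresholding, which is the subjective part, easy to change.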
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP302.8

Article ID: 1981333



Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/1981333.html

