Research on Hadoop Robustness Optimization under Cluster Computing Efficiency Constraints
Published: 2018-12-29 15:48
[Abstract]: With the continued advance of technology and the explosive growth of global data volumes, cloud storage and cloud computing are the direction of future development, and traditional databases are increasingly unable to meet the data-processing demands of individual and enterprise users. For massive data, the industry's most representative big-data storage and distributed processing system is Hadoop. Hadoop has developed rapidly in recent years; it is a reliable, efficient, and scalable open-source software framework for the distributed processing of large amounts of data.

Hadoop was originally designed on the assumption that all machines in a cluster are homogeneous, but in practice Hadoop clusters are composed of many inexpensive machines. This produces differences in the computing power of cluster nodes and leaves nodes prone to failure. Although Hadoop maintains multiple data replicas to guard against failures of computing tasks and data storage, improving the cluster's fault tolerance and reliability, its node-failure prediction, data replica placement, and task scheduling still need refinement.

To improve the robustness of Hadoop clusters, this thesis optimizes robustness in light of the differing task-execution efficiency of nodes of different performance. The main research contents are as follows (illustrative sketches of contributions (1)-(3) follow this abstract):

(1) To address Hadoop's failure to consider possible future node failures when selecting task nodes and placing data replicas, a Hadoop node failure prediction model is proposed that predicts the failure rate of every node in the cluster.

(2) Building on the node failure prediction model, Hadoop task scheduling is optimized and a node-selection strategy algorithm for data replica placement is proposed. This resolves the computing-power disparities that the default algorithm ignores because it disregards node heterogeneity, and improves cluster robustness.

(3) For nodes that have executed few tasks, or that the failure prediction model judges to have a high failure rate, a dormancy mechanism is established, settling how such nodes are handled.

(4) A Hadoop cluster was built to verify the effectiveness of the failure prediction model under the cluster computing efficiency constraint; the proposed methods improve the robustness of the Hadoop cluster.
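The abstract does not give the prediction model's actual equations. The following is a minimal Java sketch of one plausible reading of contribution (1), in which each node's failure rate is tracked as an exponentially weighted average of its task-attempt outcomes. All names here (NodeFailurePredictor, recordAttempt, the smoothing factor ALPHA) are illustrative assumptions, not the thesis's model.

    // Hypothetical per-node failure-rate predictor: a node's rate is an
    // exponentially weighted average of its recent task-attempt outcomes.
    import java.util.HashMap;
    import java.util.Map;

    public class NodeFailurePredictor {
        // Smoothing factor for the weighted average (assumed value).
        private static final double ALPHA = 0.3;
        // nodeId -> current predicted failure rate in [0, 1]
        private final Map<String, Double> rates = new HashMap<>();

        // Record the outcome of one task attempt on a node and update its rate.
        public void recordAttempt(String nodeId, boolean failed) {
            double observed = failed ? 1.0 : 0.0;
            double previous = rates.getOrDefault(nodeId, 0.0);
            // Blend the latest observation with the node's history.
            rates.put(nodeId, ALPHA * observed + (1 - ALPHA) * previous);
        }

        // Predicted probability that the next task on this node fails.
        public double predictedFailureRate(String nodeId) {
            return rates.getOrDefault(nodeId, 0.0);
        }

        public static void main(String[] args) {
            NodeFailurePredictor p = new NodeFailurePredictor();
            p.recordAttempt("node-1", false);
            p.recordAttempt("node-1", true);
            p.recordAttempt("node-2", false);
            System.out.printf("node-1: %.2f%n", p.predictedFailureRate("node-1"));
            System.out.printf("node-2: %.2f%n", p.predictedFailureRate("node-2"));
        }
    }

An exponentially weighted estimate keeps recent behavior dominant, which matters when a node is degrading rather than failing at a constant rate; the thesis's actual model may weight history differently.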
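Likewise, the abstract names a node-selection strategy for replica placement (contribution (2)) and a dormancy mechanism (contribution (3)) without detailing either. The sketch below shows one way the two could fit together: candidate nodes are scored by predicted reliability times compute capacity, and nodes with a high predicted failure rate or too little execution history are marked dormant and excluded. The thresholds, weights, and class names are assumptions for illustration only.

    // Hypothetical replica-placement selection with a dormancy rule:
    // rank awake nodes by (1 - failureRate) * capacity and take the top k.
    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.List;

    public class ReplicaPlacementSketch {
        static final double FAILURE_THRESHOLD = 0.6;  // assumed dormancy cutoff
        static final int MIN_TASKS_OBSERVED = 5;      // assumed history minimum

        static class Node {
            final String id;
            final double failureRate;   // from the prediction model
            final double capacity;      // normalized compute capacity in [0, 1]
            final int tasksObserved;    // tasks this node has executed so far
            Node(String id, double f, double c, int n) {
                this.id = id; this.failureRate = f; this.capacity = c; this.tasksObserved = n;
            }
            // Dormant nodes receive no new replicas or tasks until re-evaluated.
            boolean dormant() {
                return failureRate > FAILURE_THRESHOLD || tasksObserved < MIN_TASKS_OBSERVED;
            }
            // Higher is better: reliable, fast nodes rank first.
            double score() {
                return (1.0 - failureRate) * capacity;
            }
        }

        // Pick the k best awake nodes to host replicas of one block.
        static List<Node> selectReplicaNodes(List<Node> candidates, int k) {
            List<Node> awake = new ArrayList<>();
            for (Node n : candidates) if (!n.dormant()) awake.add(n);
            awake.sort(Comparator.comparingDouble(Node::score).reversed());
            return awake.subList(0, Math.min(k, awake.size()));
        }

        public static void main(String[] args) {
            List<Node> cluster = List.of(
                new Node("node-1", 0.05, 0.9, 40),
                new Node("node-2", 0.70, 0.8, 30),  // dormant: failure rate too high
                new Node("node-3", 0.10, 0.5, 25),
                new Node("node-4", 0.02, 0.7, 3));  // dormant: too little history
            for (Node n : selectReplicaNodes(cluster, 3)) {
                System.out.printf("%s score=%.2f%n", n.id, n.score());
            }
        }
    }

Treating low-history nodes as dormant mirrors the thesis's observation that nodes with few executed tasks cannot yet be trusted by the prediction model; how the thesis actually wakes such nodes is not stated in this abstract.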
[Degree-granting institution]: Liaoning University
[Degree level]: Master's
[Year of conferral]: 2014
[CLC number]: TP311.13; TP333
Document ID: 2395026
Link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/2395026.html