Research on a Heterogeneous Virtual Machine Platform Supporting Hadoop Configuration
Published: 2018-09-12 12:29
[Abstract]: As cloud computing has developed, data centers of widely varying sizes have appeared. These data centers often run different virtual machine management platforms (such as Eucalyptus, OpenNebula, and OpenStack) and serve entirely different application scenarios. Each platform demands its own operations and development skills and experience, and server resources cannot be shared dynamically between platforms, which degrades the performance of elastic services. Differences in machine configuration within a platform also affect the cloud computing applications running on top of it. Hadoop is one of the cloud computing applications widely used for data-intensive computing, and the correct setting of the configurable parameters of its MapReduce framework has a significant effect on computing performance. On a heterogeneous Hadoop cluster, however, users can generally only keep the default configuration or configure parameters by hand from experience. Because the parameter search space is very large, this often leads to misconfiguration and reduced computing performance.
To address the diversity of virtual machine platforms, this thesis designs and implements a heterogeneous virtual machine management platform. Without changing the structure of the existing virtual machine management platforms, it provides unified management and control of the current mainstream platforms and balanced allocation of virtual resources. It also provides an extensible adaptation-layer interface and driver components to support other heterogeneous virtual machine provisioning and management platforms.
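The adaptation-layer idea described above can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: the `VMDriver` interface, the `InMemoryDriver` stand-in backend, the `UnifiedManager` class, and the "most free CPUs" placement rule are hypothetical and are not taken from the thesis, which does not spell out its interfaces here.

```python
from abc import ABC, abstractmethod


class VMDriver(ABC):
    """Hypothetical adapter interface: one subclass per underlying platform."""

    platform: str  # label of the wrapped platform, e.g. "openstack"

    @abstractmethod
    def list_vms(self):
        """Return the names of the VMs managed by this backend."""

    @abstractmethod
    def create_vm(self, name, cpus, memory_mb):
        """Create a VM and return its name."""

    @abstractmethod
    def free_cpus(self):
        """Return the number of unallocated CPUs."""


class InMemoryDriver(VMDriver):
    """Stand-in backend for illustration; a real driver would call the platform's own API."""

    def __init__(self, platform, total_cpus):
        self.platform = platform
        self.total_cpus = total_cpus
        self.vms = {}

    def list_vms(self):
        return sorted(self.vms)

    def create_vm(self, name, cpus, memory_mb):
        if cpus > self.free_cpus():
            raise RuntimeError(f"{self.platform}: not enough free CPUs for {name}")
        self.vms[name] = {"cpus": cpus, "memory_mb": memory_mb}
        return name

    def free_cpus(self):
        return self.total_cpus - sum(vm["cpus"] for vm in self.vms.values())


class UnifiedManager:
    """Single entry point that delegates to whichever drivers are registered."""

    def __init__(self):
        self.drivers = {}

    def register(self, driver):
        # New platforms are supported by registering another driver;
        # the existing platforms themselves are left unchanged.
        self.drivers[driver.platform] = driver

    def create_vm(self, name, cpus, memory_mb):
        # Assumed balancing rule: place the VM on the backend with the most free CPUs.
        driver = max(self.drivers.values(), key=lambda d: d.free_cpus())
        driver.create_vm(name, cpus, memory_mb)
        return driver.platform

    def list_vms(self):
        return {platform: d.list_vms() for platform, d in self.drivers.items()}
```

Under this sketch, supporting an additional platform means writing one more `VMDriver` subclass and registering it, which is one plausible reading of "extensible adaptation-layer interface and driver components".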
To address the problems of running Hadoop on heterogeneous virtual machine platforms, this thesis proposes an online method, based on reinforcement learning, for automatically configuring MapReduce parameters. The method uses offline learning to build an initial policy at coarse granularity, uses online learning to configure parameters at fine granularity according to that policy, and iteratively updates the Q-value table by trial and error so that the resulting configuration approaches the optimum. Experimental results show that the method effectively improves Hadoop performance and converges quickly, so that the machines running MapReduce tasks are fully utilized and task running time is shortened.
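The trial-and-error update of a Q-value table can be illustrated with a minimal tabular Q-learning sketch in Python. It is not the thesis's method: the candidate values, the three actions, the reward definition, and in particular `simulated_runtime` (a made-up cost curve standing in for actually running a MapReduce job) are all assumptions, and the coarse-grained offline and fine-grained online phases are collapsed into one training loop.

```python
import random

# Hypothetical discretized settings for a single illustrative parameter
# (not taken from the thesis).
CANDIDATE_VALUES = [50, 100, 150, 200, 250, 300, 350, 400]
ACTIONS = (-1, 0, +1)  # decrease, keep, or increase the setting by one step


def simulated_runtime(value):
    """Stand-in for measuring a job's runtime; a made-up curve that is lowest at 250."""
    return 100.0 + 0.002 * (value - 250) ** 2


def step(state, action):
    """Apply an action, clamp to the valid range, and return (next_state, reward)."""
    next_state = min(max(state + action, 0), len(CANDIDATE_VALUES) - 1)
    # Reward is the runtime saved by the move: positive when the new setting is faster.
    reward = (simulated_runtime(CANDIDATE_VALUES[state])
              - simulated_runtime(CANDIDATE_VALUES[next_state]))
    return next_state, reward


def q_learning(episodes=200, steps=20, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a Q-table over (state, action) pairs with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(len(CANDIDATE_VALUES)) for a in ACTIONS}
    for _ in range(episodes):
        state = rng.randrange(len(CANDIDATE_VALUES))
        for _ in range(steps):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore: try a random adjustment
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Standard Q-learning update toward reward plus discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


def greedy_setting(q, start_state=0, max_steps=50):
    """Follow the learned greedy policy until it chooses to keep the current setting."""
    state = start_state
    for _ in range(max_steps):
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        if action == 0:
            break
        state, _ = step(state, action)
    return CANDIDATE_VALUES[state]
```

Calling `greedy_setting(q_learning())` walks from the lowest candidate toward the setting the learned table prefers; in this toy setup that should be the minimum of the made-up cost curve. In a real deployment the reward would come from measured job behavior rather than a formula, and the state would cover several parameters at once.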
[Degree-granting institution]: Central South University
[Degree level]: Master's
[Year degree awarded]: 2013
[Classification number]: TP302
Article ID: 2239011
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/2239011.html