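Research on Task Scheduling Algorithms under the MapReduce Framework
Published: 2018-04-16 10:09
Topic: MapReduce + Hadoop; Reference: Nanjing University of Science and Technology, 2017 master's thesis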
[Abstract]: In recent years, big data computing has become an active research topic. Hadoop and Spark are both widely used big data computing platforms based on the MapReduce framework, and their performance depends mainly on the quality of task scheduling. Research on task scheduling algorithms for Hadoop and Spark under the MapReduce framework therefore has both theoretical value and practical significance. This thesis focuses on two problems: batch job scheduling algorithms in the Hadoop environment and resource allocation methods for Web services in the Spark environment. For the batch job scheduling problem of minimizing the maximum completion time (makespan) in Hadoop, the thesis models the problem as a two-stage hybrid flow shop scheduling problem with setup (preparation) times and, based on a DAG (Directed Acyclic Graph) model, proposes two heuristic algorithms: DAGEA (Directed Acyclic Graph Earliest Available) and DAGEF (Directed Acyclic Graph Earliest Finish). Existing algorithms for two-stage hybrid flow shop scheduling with setup times are usually constructed from Gantt charts, an approach that cannot effectively account for the schedulable range of each job. In contrast, DAGEA and DAGEF are constructed from a DAG: they use the DAG to compute the schedulable range of each job and adjust job start times accordingly, which effectively improves the performance and efficiency of the algorithms. Simulation experiments confirm this conclusion. Spark computes in memory, whereas Hadoop computes on disk. Spark's current resource allocation considers coarse-grained resources such as the number of idle cores and memory. For Web service resource scheduling in the Spark environment, this thesis additionally considers resource usage such as the CPU utilization and processing capacity of cluster nodes, re-evaluates the resource utilization of each node, and then allocates resources to tasks. The new resource scheduling method, MEAN, reduces the resource granularity, thereby improving cluster resource utilization, increasing the number of Web requests processed, and improving concurrency. Task scheduling and resource allocation are the core of distributed big data computing platforms, and their quality directly determines platform performance. This thesis studies task scheduling algorithms based on the MapReduce framework, focusing on batch scheduling in the Hadoop environment and Web service resource allocation in the Spark environment. It proposes the DAGEA, DAGEF, and MEAN algorithms, and experiments demonstrate the effectiveness of the proposed algorithms.
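The scheduling model named in the abstract, a two-stage hybrid flow shop with setup times, can be illustrated with a minimal sketch. This is not DAGEA or DAGEF, whose details the abstract does not give; it is a plain greedy rule that assigns each job to the earliest available machine. The `Job` fields, the function name, and the assumption that a setup starts only once the job has reached the machine are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """A two-stage job. All field names are hypothetical."""
    name: str
    setup1: float  # setup time on the stage-1 machine
    proc1: float   # processing time in stage 1
    setup2: float  # setup time on the stage-2 machine
    proc2: float   # processing time in stage 2


def earliest_available_schedule(jobs, machines1, machines2):
    """Greedy earliest-available-machine rule (illustrative only).

    Returns (makespan, schedule), where schedule maps a job name to
    (stage1_machine, start1, end1, stage2_machine, start2, end2).
    """
    free1 = [0.0] * machines1  # time at which each stage-1 machine is free
    free2 = [0.0] * machines2  # time at which each stage-2 machine is free

    # Stage 1: take jobs in the given order, each on the machine that frees first.
    stage1 = []
    for job in jobs:
        m1 = min(range(machines1), key=lambda i: free1[i])
        start1 = free1[m1]
        end1 = start1 + job.setup1 + job.proc1
        free1[m1] = end1
        stage1.append((end1, job, m1, start1))

    # Stage 2: take jobs in order of stage-1 completion. Assumption: the
    # setup cannot begin before the job has finished stage 1.
    schedule = {}
    for end1, job, m1, start1 in sorted(stage1, key=lambda t: t[0]):
        m2 = min(range(machines2), key=lambda i: free2[i])
        start2 = max(free2[m2], end1)
        end2 = start2 + job.setup2 + job.proc2
        free2[m2] = end2
        schedule[job.name] = (m1, start1, end1, m2, start2, end2)

    makespan = max((rec[5] for rec in schedule.values()), default=0.0)
    return makespan, schedule


if __name__ == "__main__":
    jobs = [
        Job("A", 1.0, 3.0, 1.0, 2.0),
        Job("B", 1.0, 2.0, 1.0, 4.0),
        Job("C", 0.0, 2.0, 1.0, 1.0),
    ]
    makespan, schedule = earliest_available_schedule(jobs, machines1=2, machines2=1)
    print(makespan)
    for name, record in schedule.items():
        print(name, record)
```

A rule like this fixes each start time as soon as a machine frees up and never revisits it. According to the abstract, the thesis's DAG-based algorithms instead compute a schedulable range for each job and adjust start times within it.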
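[Degree-granting institution]: Nanjing University of Science and Technology
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP393.09; TP311.13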
Article ID: 1758417
Article link: https://www.wllwen.com/guanlilunwen/ydhl/1758417.html