Prediction-Based Optimization of Hadoop Task Scheduling Algorithms

Published: 2019-06-19 18:06
[Abstract]: When a straggler task appears, the existing Hadoop speculative-execution scheduler launches a backup copy of that task on an idle node, but it does not adequately consider the node's current performance. The backup task may therefore still fail or run very slowly, which leads to a high backup-task failure rate, consumes more system resources, and delays system response time. Studying the existing Hadoop task scheduling algorithms and proposing an improved scheme for scheduling backup tasks is therefore important for improving system performance. This thesis proposes a prediction-based Hadoop task scheduling optimization algorithm, the CPL (Computation Prediction of Late) scheduling algorithm, which contains two main optimizations. First, the system maintains two prediction queues: a queue of CPU-idle nodes and a queue of I/O-idle nodes. Each queue is sorted in ascending order of task failure rate. After matching the task type to the node type, the scheduler predicts a node that is about to become idle and has a low failure rate, and assigns the backup task to it, which reduces the backup-task failure rate. Second, the existing task classification algorithm is corrected using the sum of the CPU time slices occupied by Map tasks, giving a more accurate method of classifying task types. The performance of the CPL scheduling algorithm was verified through simulation experiments on the CloudSim cloud computing simulation platform. The results show that the job response time of the CPL scheduling algorithm is 20% and 14% lower than that of the FIFO and LATE scheduling algorithms, respectively, and that the backup-task failure rate of the CPL scheduling algorithm is on average 16% lower than that of the LATE scheduling algorithm.
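The two optimizations described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract gives no formulas, so the `Node` fields, the `cpu_threshold` value, the CPU-share ratio used for classification, and the rule that a CPU-bound task is matched to the CPU-idle queue are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional, Sequence


class Kind(Enum):
    CPU = "cpu"
    IO = "io"


@dataclass
class Node:
    name: str
    idle_kind: Kind       # resource the node is predicted to have spare (assumed field)
    failure_rate: float   # fraction of past tasks on this node that failed
    soon_idle: bool       # predicted to become idle shortly (assumed field)


def classify_task(cpu_slices: Sequence[float], elapsed: float,
                  cpu_threshold: float = 0.5) -> Kind:
    """Classify a Map task from the sum of its CPU time slices.

    Assumption: a task is CPU-bound when summed CPU time is at least
    `cpu_threshold` of its elapsed time; the threshold is hypothetical.
    """
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    cpu_share = sum(cpu_slices) / elapsed
    return Kind.CPU if cpu_share >= cpu_threshold else Kind.IO


def build_queues(nodes: Sequence[Node]) -> Dict[Kind, List[Node]]:
    """Build the two prediction queues, each sorted by ascending failure rate."""
    queues: Dict[Kind, List[Node]] = {Kind.CPU: [], Kind.IO: []}
    for node in nodes:
        if node.soon_idle:  # only nodes predicted to become idle are queued
            queues[node.idle_kind].append(node)
    for queue in queues.values():
        queue.sort(key=lambda n: n.failure_rate)
    return queues


def pick_backup_node(task_kind: Kind,
                     queues: Dict[Kind, List[Node]]) -> Optional[Node]:
    """Return the lowest-failure-rate node in the queue matching the task type."""
    queue = queues[task_kind]
    return queue[0] if queue else None


nodes = [
    Node("a", Kind.CPU, 0.30, True),
    Node("b", Kind.CPU, 0.10, True),
    Node("c", Kind.IO, 0.05, True),
    Node("d", Kind.CPU, 0.01, False),  # low failure rate but not predicted idle
]
queues = build_queues(nodes)
straggler_kind = classify_task([2.0, 3.0, 1.0], elapsed=8.0)
chosen = pick_backup_node(straggler_kind, queues)
print(straggler_kind, chosen.name if chosen else None)
```

In this sketch node `d` is skipped despite its low failure rate because it is not predicted to become idle, which mirrors the abstract's requirement that the chosen node be both about to become idle and low in failure rate.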
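[Degree-granting institution]: Huazhong University of Science and Technology
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP301.6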



