GPU中针对任务完工时间最小化问题的研究

发布时间：2018-10-25 19:49

【摘要】：随着图形处理器(GPU)技术快速发展,GPU已经具有高度的并行性以及灵活的可编程性,这使得GPU在通用计算和并行处理领域得到了广泛研究和应用。GPU作为一种新的计算主体,具有深入研究的价值。对于GPU,用户往往更关注于所有任务在GPU资源上总完成时间。对于一组任务集合,在GPU设备中的完工时间(Makespan)指的是从任务开始执行到所有任务执行完毕所需要的总时间。对于任务集合如何在GPU内部多个流处理器之间调度,以最小化完工时间,以及如何计算一个流处理器上任务完成时间等问题,目前国内外相关研究工作较少,本文针对这两方面问题,提出了相应的解决方法,对提高GPU资源利用率具有极其重要的意义。本文首先根据GPU结构的主要特点,建立了一个关于最小化完工时间的模型,提出了一个使任务集在GPU多个流处理器之间总完工时间最小的调度算法,并且从理论上证明了该算法在最坏情况下的结果不会超过最优解的2倍。此外,针对目前比较常见的GPU结构,提出了一个针对两个流处理器下的改进算法,并通过实验证明了该算法具有更好的效率。另一方面,本文根据问题实际特点,针对流处理器上任务由多个子任务并行执行的特点,提出了三种计算任务完工时间的方法。首先提出了一种悲观计算方法,该方法可计算实际问题可能达到的理论上限；然后,使用二维线性规划方法为问题建立了方程组,给出了精确计算结果；最后结合前两种方法建立了一种易处理问题的优化计算方法。在模拟试验环节,对本文提出的调度算法和计算方法设计了对比试验。实验结果显示,多流处理器调度算法能够获得较好的完工时间结果,而双流处理器环境下的改进算法则比前一个算法更加优秀。在任务在流处理器上完成时间计算上,二维线性规划方法的精确度较高,但当问题达到一定规模量后,其计算所需时间将会较大。而易处理计算问题上的优化计算算法则可以兼顾精确度和速度。
[Abstract]:With the rapid development of graphics processor (GPU) technology, GPU has a high degree of parallelism and flexible programmability, which makes GPU widely studied and applied in the field of general computing and parallel processing. GPU is a new computing subject. It has the value of further study. GPU, users tend to focus more on the total completion time of all tasks on GPU resources. For a set of tasks, the completion time (Makespan) in a GPU device is the total time required from the start of the task to the completion of all tasks. At present, there are few researches on how to schedule the task set between multiple stream processors in GPU to minimize the completion time and how to calculate the task completion time on a stream processor. In view of these two problems, this paper puts forward the corresponding solutions, which is of great significance to improve the utilization of GPU resources. In this paper, according to the main characteristics of GPU architecture, a model for minimizing the completion time is established, and a scheduling algorithm is proposed to minimize the total completion time between multiple GPU stream processors. It is proved theoretically that the result of the algorithm in the worst case is not more than 2 times of the optimal solution. In addition, an improved algorithm based on two stream processors is proposed for the current GPU architecture, and the experimental results show that the algorithm is more efficient. On the other hand, according to the practical characteristics of the problem, this paper proposes three methods to calculate the completion time of the task, aiming at the parallel execution of the task by multiple sub-tasks on the stream processor. In this paper, a pessimistic calculation method is proposed, which can calculate the theoretical upper limit of practical problems, and then the equations are established by using two-dimensional linear programming method, and the exact calculation results are given. In the end, an optimization method for solving the problem is established by combining the first two methods. In the simulation experiment, a comparative test is designed for the scheduling algorithm and calculation method proposed in this paper. Experimental results show that the multi-stream processor scheduling algorithm can obtain better completion time results, while the improved algorithm in the dual-stream processor environment is better than the previous one. The precision of two-dimensional linear programming method is high when the task is completed on the stream processor, but when the problem reaches a certain scale, the computing time will be longer. The optimal calculation algorithm on the easy-processing problem can take into account the accuracy and speed.
【学位授予单位】：东北大学
【学位级别】：硕士
【学位授予年份】：2012
【分类号】：TP332

【参考文献】