Design and Implementation of an Optimized Job Scheduling Model for High Performance Computing Environments
Published: 2018-12-19 20:23
[Abstract]: A high performance computing environment aggregates multiple high performance computing resources that are distributed across different regions and organizations, and gives users a unified access point and a uniform way of using them. The system middleware matches each user job request to a suitable computing resource. As the environment's application programming interface has been opened up and the number of job requests has grown sharply, the instant scheduling model currently in use fails to process a certain number of requests under highly concurrent job submission, for network and other reasons, and it also lacks flexibility. To address this problem, the environment's job scheduling model is optimized: an environment-level job queue is introduced, the system-layer job states are refined, the job scheduling policy is made configurable, and a system prototype is implemented on the environment middleware SCE. In tests with a workload of nearly 200 job submission requests per minute handled by a single core service, no job submission errors caused by system or network problems were observed. Of 1,000 jobs in total, nearly 500 job submission command requests completed within 0.3 s, and more than 800 completed within 0.5 s.
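The queue-based model described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from SCE: the state names, the `EnvironmentQueue` class, the `most_free_cores` policy, and the resource names are all hypothetical. The point it shows is that submission only enqueues the job (no remote call that could fail), while a separate dispatch pass applies a swappable policy.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Deque, Dict, List, Optional


class JobState(Enum):
    # Hypothetical system-layer states; the paper's real state names are not given here.
    QUEUED = "queued"          # accepted into the environment-level queue
    DISPATCHED = "dispatched"  # handed over to a computing resource


@dataclass
class Job:
    job_id: int
    cores: int
    state: JobState = JobState.QUEUED
    resource: Optional[str] = None


# A policy maps (job, free cores per resource) to a resource name, or None to keep waiting.
Policy = Callable[[Job, Dict[str, int]], Optional[str]]


def most_free_cores(job: Job, free: Dict[str, int]) -> Optional[str]:
    """Example policy (an assumption): pick the fitting resource with the most free cores."""
    candidates = [name for name, n in free.items() if n >= job.cores]
    return max(candidates, key=lambda name: free[name]) if candidates else None


class EnvironmentQueue:
    def __init__(self, free_cores: Dict[str, int], policy: Policy = most_free_cores):
        self.free_cores = dict(free_cores)
        self.policy = policy  # configurable: any function with the Policy signature
        self.pending: Deque[Job] = deque()

    def submit(self, job: Job) -> int:
        """Accept the request immediately; nothing is sent to a remote resource here."""
        job.state = JobState.QUEUED
        self.pending.append(job)
        return job.job_id

    def dispatch_once(self) -> List[Job]:
        """One scheduler pass: try each pending job once, requeue those that must wait."""
        dispatched = []
        for _ in range(len(self.pending)):
            job = self.pending.popleft()
            target = self.policy(job, self.free_cores)
            if target is None:
                self.pending.append(job)  # stays QUEUED for a later pass
                continue
            self.free_cores[target] -= job.cores
            job.state, job.resource = JobState.DISPATCHED, target
            dispatched.append(job)
        return dispatched
```

Replacing the `policy` argument with another function of the same signature changes the matching behaviour without touching the submission path, which is one way to read "configurable scheduling policy".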
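[Author affiliation]: Computer Network Information Center, Chinese Academy of Sciences
[Funding]: National Key R&D Program of China (2016YFB0201404); 863 Program major project under the 12th Five-Year Plan (2014AA01A302)
[Classification number]: TP38
Article ID: 2387401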
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/2387401.html