Development of a Distributed Cluster Management System Based on the LSF API
Published: 2018-02-11 14:13
Keywords: distributed computing, LSF, PLATFORM, cluster, cloud computing  Source: Chang'an University, 2012 master's thesis  Document type: degree thesis
[Abstract]: With the advance of the information revolution, the computing power of a single machine has fallen far short of what complex computation requires. As early as the 1980s, people therefore proposed linking multiple machines into a cluster with greater combined computing power; this is distributed computing. Other countries currently lead in the field of distributed computing, but with strong government support in recent years, many domestic enterprises have also built their own distributed computing platforms. This thesis uses LSF, a well-known load-balancing software package from abroad, to develop a distributed cluster management system that can both manage the computers in a user's cluster and allocate tasks sensibly according to each machine's performance and the scheduling policy.
First, the thesis describes the characteristics of distributed computing, grid computing, parallel computing, and cloud computing, and compares distributed computing with each of the other three in terms of scope, application, and nature. This background lays the foundation for the design of the distributed cluster management system.
Second, the thesis studies the basic architecture of the LSF software in depth, covering installation and configuration, the daemon system, and the task life cycle. It then systematically introduces the LSF batch system architecture from four angles: the base system architecture of the LSF API, the batch system architecture of the LSF API, the base API services of LSF, and the batch API services of LSF. This systematic understanding of LSF and the LSF API architecture lays the groundwork for developing a cluster management system based on the LSF API.
Next, the thesis focuses on the design and implementation of the distributed cluster management system. Starting from the high-level design, it describes the relationships among the system's major functional components and compares the differences among development tools and languages. It then presents the design of the login system and the main program framework and, working from the two major system structures LSFADMIN and BATCHADMIN, takes two representative parts as examples to explain their functional design, back-end design, and page design in detail.
Finally, using an automobile assembly line as an example, the thesis describes an application of the cluster management system. Then, from the perspective of the relationship between distributed computing and cloud computing, it proposes a view of cloud computing: a cluster management system is an embryonic form of cloud computing.
[Degree-granting institution]: Chang'an University
[Degree level]: Master's
[Year degree conferred]: 2012
[Classification number]: TP338.8