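Research and Implementation of a Load Balancing Mechanism in a Parallel Network File System
Published: 2018-05-08 18:49
Topics: parallel network file system + load balancing; Source: Huazhong University of Science and Technology, master's thesis, 2012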
[Abstract]: Load balancing is a common and indispensable optimization technique in parallel file systems. Implementing replication in a parallel file system, together with load analysis methods and scheduling algorithms, makes it possible to distribute the system's various loads evenly across the storage nodes, improving the availability, stability, and quality of service of the file system.
In a parallel file system, as nodes are added and removed, files are created, deleted, and modified, and the number of accesses grows explosively, the load on each storage node changes dynamically and unpredictably. Often some servers are overloaded while others are underutilized, a condition known as load skew. In addition, a sudden node failure can easily prevent the system from operating normally. Replication helps with both problems, but each scenario still calls for a study of load balancing, built on replicas and informed by that scenario's I/O behavior, to find suitable strategies and the right time to apply them. For pNFS (Parallel Network File System), a typical parallel file system, this thesis designs and implements PDDB (Probability Distribution Dynamic Balance), a dynamic load balancing mechanism based on probability distribution.
Within the parallel network file system, PDDB creates replicas of files and places them according to capacity balance, and it provides two replica placement modes, mirrored and interleaved. PDDB builds an adaptive load monitoring system on the data servers. The metadata server collects load information from each node, including CPU, memory, storage space, network bandwidth, and disk bandwidth. After consolidating this information and combining it with previously collected historical data, PDDB uses the magnitude of the composite load to determine the probability of assigning a task to a server, so that accesses are spread evenly across the group of servers whose current load is low. Hotspot migration then adjusts the load among nodes to avoid a clustering effect. PDDB also manages replica metadata and maintains replica consistency, so that when the storage node holding any one replica fails, the file contents can still be read from the other replicas and the system continues to operate normally.
Testing shows that, compared with load balancing mechanisms based on a random algorithm and on a least-load-first algorithm, PDDB reduces the maximum load gap between storage nodes by 42% and 30%, respectively. Under the same total file system load, the load on each node also varies more smoothly, and the average load drops by at least 10%. The average network throughput of the file system increases by 20%, and reliability, stability, and scalability all improve.
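The abstract's core idea, choosing a server with a probability derived from its composite load, can be illustrated with a minimal sketch. This is not the thesis's implementation: the metric weights, the exponential blend with historical load, the rule that probability is proportional to spare capacity, and all names (`WEIGHTS`, `ServerLoad`, `composite_load`, `selection_probabilities`, `pick_server`) are assumptions made for illustration, since the abstract does not give the actual formulas.

```python
import random
from dataclasses import dataclass

# Hypothetical weights for the five metrics named in the abstract.
# The thesis's real weighting is not stated there; these sum to 1.0.
WEIGHTS = {"cpu": 0.3, "memory": 0.2, "storage": 0.1, "network": 0.2, "disk": 0.2}


@dataclass
class ServerLoad:
    name: str
    metrics: dict          # utilization per metric, each in [0.0, 1.0]
    history: float = 0.0   # previously recorded composite load, in [0.0, 1.0]


def composite_load(server, alpha=0.7):
    """Blend the current weighted load with the historical value.

    The exponential blend (alpha) is an assumption, not the thesis's formula.
    """
    current = sum(WEIGHTS[k] * server.metrics[k] for k in WEIGHTS)
    return alpha * current + (1.0 - alpha) * server.history


def selection_probabilities(servers, group_size=3):
    """Assign probabilities to the `group_size` least-loaded servers.

    Each candidate's probability is proportional to its spare capacity
    (1 - composite load), so lighter servers are chosen more often.
    """
    loads = {s.name: composite_load(s) for s in servers}
    candidates = sorted(loads, key=loads.get)[:group_size]
    spare = {name: max(1.0 - loads[name], 1e-9) for name in candidates}
    total = sum(spare.values())
    return {name: spare[name] / total for name in candidates}


def pick_server(servers, group_size=3, rng=random):
    """Draw one server at random according to the computed probabilities."""
    probs = selection_probabilities(servers, group_size)
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]
```

Drawing from a distribution, rather than always picking the single least-loaded server, spreads concurrent requests over several lightly loaded servers, which is one plausible way to limit the clustering effect the abstract mentions.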
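[Degree-granting institution]: Huazhong University of Science and Technology
[Degree level]: Master's
[Year conferred]: 2012
[Classification number]: TP338.6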
Article ID: 1862609
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/1862609.html