
An On-demand Parallel File Transfer Algorithm Supporting Virtual Cluster Migration

Published: 2018-01-16 09:37

  Source: Jilin University, master's thesis, 2013. Document type: degree thesis


  Keywords: virtual cluster; destination-side parallelism; on-demand scheduling; file transfer


[Abstract]: In recent years, virtualization and grid technologies have been widely combined, and virtual clusters in distributed environments are now often used to solve parallel processing problems. Because virtual clusters are dynamic and can migrate, virtual cluster applications frequently need to transfer large files between physical clusters. In addition, with the spread of computers and the Internet, massive amounts of data are generated every day, and the volume is growing explosively. Grid computing has also driven the broad development of large-scale data-intensive applications. In such applications, the clusters that generate, store, and process the data are often physically far apart, and data files are gathered from remote collection points to a processing center for computation, display, and storage. Both virtual clusters and data-intensive parallel processing therefore require efficient transfer of massive data in a wide-area, distributed, shared computing environment. During parallel processing, large data files should reach their processing nodes in as short a time as possible so that processing can proceed concurrently. How to transfer large files quickly among multiple clusters has consequently become a research focus.

Many researchers in China and abroad have studied large-file transfer algorithms for parallel processing and have proposed techniques that improve transfer performance from two angles: scheduling policy and routing policy. Current research concentrates on increasing transfer parallelism and network bandwidth utilization, and on shortening the overall transfer time of a batch of file requests. Most existing work achieves parallel transfer through multipath transfer, multi-hop path transfer, and multiple file replicas, but it does not consider the destination's ability to receive file fragments in parallel, which is what allows transfer parallelism to be raised further. Some existing work also does not consider global control and conflict coordination for batch file requests across the whole network.

To address these problems, this thesis proposes OFPT (On-demand File Parallel Transfer), an on-demand parallel file transfer algorithm that supports virtual cluster migration. The goal of OFPT is to minimize the overall completion time of a batch of file transfer requests. Exploiting the fact that data moves quickly inside a cluster, the algorithm extends the destination to every node in the cluster that has an external connection, which achieves destination-side parallelism and spreads the transfer load that would otherwise fall on a single node. For transfer paths, it uses multiple paths in parallel; for each individual path, it uses multi-hop path hashing and flexibly adjusts the hop-count limit to obtain the best transfer path. For a batch of file transfer requests, it allocates network bandwidth on demand across the whole network according to the transfer load of each request, resolving bandwidth conflicts between the paths of different requests. This improves bandwidth utilization and lets the batch complete quickly.

The thesis uses the NS2 simulator to model batch transfers of large files among multiple clusters and reports detailed experiments that test, in turn, how transfer mode, number of file replicas, and transfer load affect the algorithm. The experimental results show that OFPT effectively improves network resource utilization and clearly outperforms existing large-file transfer algorithms for wide-area networks in throughput and other transfer metrics, meeting the expectations set for this work.
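The abstract does not state the exact rule OFPT uses to allocate bandwidth on demand or to divide a file among destination-side nodes. As a rough illustration only, the sketch below assumes a plain proportional-share rule: a link's capacity is divided among competing requests in proportion to their remaining load, and a file is divided among receiving nodes in proportion to their external bandwidth. The function names `proportional_share` and `split_across_receivers`, and all numbers, are hypothetical and do not come from the thesis.

```python
from typing import Dict


def proportional_share(capacity: float, weights: Dict[str, float]) -> Dict[str, float]:
    """Divide `capacity` among keys in proportion to their weights.

    Assumption (not from the thesis): a simple proportional-share rule.
    When the weights are remaining transfer loads on one shared link,
    every request finishes at the same time, total_load / capacity.
    """
    total = sum(weights.values())
    if total <= 0:
        # Nothing to share out: give every key a zero allocation.
        return {name: 0.0 for name in weights}
    return {name: capacity * w / total for name, w in weights.items()}


def split_across_receivers(file_size: float, receiver_bw: Dict[str, float]) -> Dict[str, float]:
    """Divide one file among destination-side nodes by external bandwidth.

    Faster receivers get larger fragments, so all fragments take the
    same time to arrive if each node receives at its full bandwidth.
    """
    return proportional_share(file_size, receiver_bw)


if __name__ == "__main__":
    # Three hypothetical requests (loads in MB) sharing a 100 MB/s link.
    loads = {"r1": 600.0, "r2": 300.0, "r3": 100.0}
    rates = proportional_share(100.0, loads)
    for name, rate in rates.items():
        print(name, "rate:", rate, "finish time:", loads[name] / rate)

    # One 900 MB file received by two hypothetical gateway nodes.
    print(split_across_receivers(900.0, {"n1": 2.0, "n2": 1.0}))
```

Under this assumed rule the heaviest request gets the largest share of the shared link, so no single request determines the batch completion time on that link. A real scheduler would also have to account for multiple links per path and per-path hop limits, which this sketch leaves out.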
[Degree-granting institution]: Jilin University
[Degree level]: Master's
[Year degree conferred]: 2013
[Classification number]: TP393.093;TP338.6





Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/1432526.html

