
Research and Implementation of a File Transfer Strategy Based on HDFS

Published: 2018-06-13 13:20

  Topics: HDFS + BT; Source: Jilin University, master's thesis, 2013


[Abstract]: With the rapid development of computers and networks, data volumes are growing sharply. According to Cisco's Global Cloud data report, unrestricted demand from users and enterprises to access and use data will drive global cloud data traffic to grow by 66% per year between 2010 and 2015. Traditional distributed high-performance processing platforms can no longer keep up with this explosive growth in data processing requests. Cloud computing and cloud storage emerged to serve such data-intensive requests.

This thesis studies the transfer strategy used during reads and writes in HDFS, the file system underlying the Hadoop family of cloud services. The goal is a more efficient and more fault-tolerant file system, achieved by improving the efficiency and fault tolerance of basic HDFS operations such as reading and writing. Every operation in cloud storage depends on calls to the file system. Reads and writes are the most basic and most frequently used HDFS operations, so studying and improving their transfer process to make them parallel and highly fault tolerant can greatly improve the access speed and availability of data in cloud storage services.

The thesis first introduces the theory and technology of cloud storage, defines cloud storage, and describes its application scenarios. It then analyzes HDFS in depth and examines and compares the related technologies. Next it describes in detail the techniques required for fault-tolerant HDFS reads and writes, which provide the technical basis for the subsequent work. Finally, after analyzing the HDFS read and write mechanisms, it improves file transfer in those mechanisms to implement parallel transfer at the level of file data blocks, offering a reliable solution to the high latency and replica security problems of cloud storage.

The main contributions are a parallel transfer strategy for file reads and writes in HDFS and an improved automatic replica replication strategy. Together they raise read and write efficiency, reduce latency, and give cloud storage users an efficient and stable service. The improved strategy makes full use of existing replicas and spreads the network load: data read efficiency can improve by 160%, and replica replication efficiency also improves substantially.
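The abstract does not give the implementation, so the following is only a minimal sketch of the general idea of block-level parallel reads that spread load across replicas and fall back to another replica when a node is unreachable. It simulates a cluster in memory and does not use the HDFS API; `build_cluster`, `fetch_block`, `parallel_read`, the node names, and the round-robin replica placement are all assumptions made for illustration, not the thesis's method.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def build_cluster(data, block_size, nodes, replication):
    """Split data into blocks and place each block on `replication` nodes.

    Hypothetical in-memory stand-in: a real HDFS client would ask the
    NameNode for block locations instead of computing them itself.
    """
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    storage = {node: {} for node in nodes}
    locations = []
    for idx, block in enumerate(blocks):
        # Simple round-robin placement (an assumption, not HDFS's policy).
        holders = [nodes[(idx + r) % len(nodes)] for r in range(replication)]
        for node in holders:
            storage[node][idx] = block
        locations.append(holders)
    return storage, locations


def fetch_block(storage, holders, idx, failed):
    """Return (node, data) from the first holder that is not marked failed."""
    for node in holders:
        if node in failed:
            continue  # simulate an unreachable DataNode and try the next replica
        return node, storage[node][idx]
    raise IOError(f"no live replica for block {idx}")


def parallel_read(storage, locations, failed=frozenset(), workers=4):
    """Fetch all blocks concurrently and reassemble them in order."""
    def task(idx):
        holders = locations[idx]
        # Rotate the holder list so consecutive blocks prefer different replicas.
        start = idx % len(holders)
        ordered = holders[start:] + holders[:start]
        return fetch_block(storage, ordered, idx, failed)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so blocks stay in sequence.
        results = list(pool.map(task, range(len(locations))))
    served_by = Counter(node for node, _ in results)
    return b"".join(block for _, block in results), served_by
```

Calling `parallel_read(storage, locations, failed={"dn2"})` on a cluster built with a replication factor of 3 should still return the full file, because every block has other live replicas. The returned counter shows how many blocks each node served.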
[Degree-granting institution]: Jilin University
[Degree level]: Master's
[Year conferred]: 2013
[Classification numbers]: TP333; TP309







