Research on Key Technologies of Parallel File Storage Systems

Published: 2018-10-25 13:15
[Abstract]: With the growth of the Internet and the steady advance of informatization and digitization, data volumes are growing explosively. Although the capacity and performance of traditional single-machine storage have improved greatly over the past few decades, single-machine storage still cannot cope with massive amounts of data. How to build a data storage system with high performance, large capacity, high reliability, and high scalability has therefore become an important problem, and distributed parallel file storage systems emerged against this background.

Distributed parallel file storage systems are currently a research hotspot in both academia and industry, and research institutions and companies have already produced many results. However, most of the resulting products were designed around their makers' own business needs, so they have considerable limitations and shortcomings and leave substantial room for research and improvement. The main work of this thesis is as follows:

(1) Mainstream distributed file storage systems such as GFS and Global File System are compared and analyzed, their strengths and weaknesses are summarized, and a new distributed file system architecture with a flat file organization is proposed.

(2) An index structure based on a hash table and a scaling mechanism based on consistent hashing are designed. Simulation tests verify that consistent hashing scales better than traditional modulo-based hashing.

(3) By analyzing the implementation principles and details of the Linux file system, the thesis reveals its shortcomings for storing massive numbers of files. On this basis, it designs and describes in detail a data storage scheme for storage nodes based on a merge mechanism. Experiments verify that this scheme delivers better read and write performance than storing data directly on the file system.

(4) Two causes of system load imbalance are analyzed: uneven client access load and hot data. For the first cause, the thesis proposes a load balancing strategy that combines a server load model with the static performance of each node to balance client access load. For the second cause, it proposes a replica count management strategy based on data heat statistics, which dynamically increases the number of replicas of hot data so that the load is spread across multiple nodes.
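The scalability claim in item (2) can be illustrated with a small simulation. The abstract does not describe the thesis's actual test setup, so the following is only a minimal sketch under stated assumptions: the hash function (MD5), the use of virtual nodes, the node and key names, and the helper names `stable_hash`, `modulo_node`, `ConsistentHashRing`, and `moved_fraction` are all hypothetical choices made for illustration. It measures what fraction of keys change their owning node when one node is added.

```python
import bisect
import hashlib
from typing import Callable, Sequence


def stable_hash(key: str) -> int:
    """Map a string to a 64-bit integer (MD5 is an arbitrary, assumed choice)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def modulo_node(key: str, nodes: Sequence[str]) -> str:
    """Traditional placement: hash(key) mod number of nodes."""
    return nodes[stable_hash(key) % len(nodes)]


class ConsistentHashRing:
    """Generic consistent-hash ring with virtual nodes (an assumed design)."""

    def __init__(self, nodes: Sequence[str], vnodes: int = 100) -> None:
        # Each physical node is placed on the ring at `vnodes` points.
        self._ring = sorted(
            (stable_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        # The first ring point clockwise from the key's hash owns the key;
        # the modulo wraps around past the last point.
        idx = bisect.bisect_right(self._points, stable_hash(key))
        return self._ring[idx % len(self._ring)][1]


def moved_fraction(
    keys: Sequence[str],
    before: Callable[[str], str],
    after: Callable[[str], str],
) -> float:
    """Fraction of keys whose owning node differs between two placements."""
    moved = sum(1 for key in keys if before(key) != after(key))
    return moved / len(keys)


if __name__ == "__main__":
    keys = [f"file-{i}" for i in range(10000)]
    old_nodes = [f"node-{i}" for i in range(8)]
    new_nodes = old_nodes + ["node-8"]  # scale out by one node

    old_ring = ConsistentHashRing(old_nodes)
    new_ring = ConsistentHashRing(new_nodes)

    mod_moved = moved_fraction(
        keys,
        lambda k: modulo_node(k, old_nodes),
        lambda k: modulo_node(k, new_nodes),
    )
    ring_moved = moved_fraction(keys, old_ring.node_for, new_ring.node_for)
    print(f"modulo hashing:     {mod_moved:.1%} of keys moved")
    print(f"consistent hashing: {ring_moved:.1%} of keys moved")
```

With a roughly uniform hash, growing from 8 to 9 nodes should remap about 8/9 of the keys under modulo placement but only about 1/9 under the ring, and every key that moves on the ring moves to the newly added node. The exact percentages depend on the assumed hash function and virtual-node count.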
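[Degree-granting institution]: South China University of Technology
[Degree level]: Master's
[Year degree conferred]: 2012
[Classification number]: TP333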

Article ID: 2293800

Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/2293800.html

