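Research and Implementation of a Distributed Cache System for a Cloud Storage Gateway
Published: 2018-07-09 20:46
Topics: cloud storage + cloud storage gateway; Source: National University of Defense Technology, master's thesis, 2012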
[Abstract]: With the rapid development of Internet technology, the volume of data generated across industries has grown sharply. Traditional mass storage systems scale poorly: they can only be scaled vertically through equipment upgrades, which raises management and operating costs. Cloud storage systems built around a distributed file system offer distinct advantages in storage capacity, scalability, and reliability, and are increasingly used for mass data storage. However, mainstream cloud storage systems lack a unified interface, so existing applications built on other systems cannot access them directly and are hard to migrate quickly. Data security in cloud storage is also a core concern for users.

To meet the needs of migrating existing applications quickly to cloud storage platforms and of keeping data secure, the research group designed the cloud storage gateway JoinIn. JoinIn abstracts the back-end cloud storage system as a traditional file system and exposes a standard POSIX interface to users. JoinIn's metadata server sits on the local area network, where access is secure and controlled, while the data itself is stored in the back-end cloud storage system.

To address the high data access latency and low throughput that result from the cloud storage architecture, this thesis studies and implements a distributed cache system for JoinIn. The main design idea is to apply the caching principle of "fetch once, read many times": exploiting locality of access, frequently accessed content is kept in a cache system close to the user, so that repeat accesses are served quickly from the cache. This avoids interaction with the back-end cloud storage system, reducing data transfer latency, relieving load on the back-end servers, and saving bandwidth.
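The "fetch once, read many times" idea can be illustrated with a small read-through cache. This is only a sketch under stated assumptions, not JoinIn's implementation: the class `ReadThroughCache`, the function `slow_backend_read`, and the file path are hypothetical names, and eviction here is plain LRU rather than the thesis's JoinIn_LRU.

```python
from collections import OrderedDict
from typing import Callable


class ReadThroughCache:
    """Fetch an object from the backend once, then serve repeat reads locally."""

    def __init__(self, fetch: Callable[[str], bytes], capacity: int = 2) -> None:
        self._fetch = fetch          # hypothetical backend read function
        self._capacity = capacity    # maximum number of cached objects
        self._entries: "OrderedDict[str, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, key: str) -> bytes:
        if key in self._entries:
            self.hits += 1
            self._entries.move_to_end(key)     # mark as most recently used
            return self._entries[key]
        self.misses += 1
        data = self._fetch(key)                # slow path: contact the backend
        self._entries[key] = data
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict the least recently used
        return data


backend_calls = []


def slow_backend_read(key: str) -> bytes:
    """Stand-in for a read against the back-end cloud storage (assumption)."""
    backend_calls.append(key)
    return f"contents of {key}".encode()


cache = ReadThroughCache(slow_backend_read, capacity=2)
for _ in range(3):
    cache.read("/data/report.txt")
print(cache.hits, cache.misses, len(backend_calls))  # prints: 2 1 1
```

Three reads of the same file reach the backend only once; the other two are served from the cache, which is where the latency and bandwidth savings come from.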
The main work and contributions of this thesis are:
1) An architecture for the JoinIn cache system is proposed. Because a memory cache is limited in capacity and volatile, a two-level cache structure composed of memory and disk is proposed, which enlarges the cache capacity and makes cached content persistent.
2) A replacement algorithm for the JoinIn cache system, JoinIn_LRU, is proposed. The classical LRU algorithm ignores access counts; building on LRU, JoinIn_LRU takes both the interval between accesses and the number of accesses into account.
3) A consistent-hashing cache cluster architecture based on virtual nodes is designed and implemented. With the scalability of a single-node cache system in mind, a distributed cache cluster architecture is designed and implemented on the basis of an in-depth study of the consistent hashing algorithm.
A test environment was built and the system underwent full functional and performance testing. The experimental results show that the read performance of the cloud storage system with the cache system improved substantially. The cache system designed in this thesis is therefore an effective means of improving the user experience of cloud storage systems.
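Consistent hashing with virtual nodes, the technique named in the third contribution, can be sketched roughly as follows. This is a generic illustration rather than the thesis's cluster code: the class `HashRing`, the node names, the choice of MD5, and the value of 100 virtual nodes per server are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a 64-bit position on the ring (MD5 is an assumption)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent hash ring in which each cache node owns many virtual nodes."""

    def __init__(self, nodes=(), vnodes: int = 100) -> None:
        self._vnodes = vnodes
        self._points = []   # sorted ring positions of all virtual nodes
        self._owner = {}    # ring position -> physical node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self._vnodes):
            point = _hash(f"{node}#{i}")
            if point in self._owner:
                continue    # skip the (extremely unlikely) position collision
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove_node(self, node: str) -> None:
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def get_node(self, key: str) -> str:
        if not self._points:
            raise LookupError("ring is empty")
        # First virtual node clockwise from the key, wrapping around at the end.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
keys = [f"/file/{i}" for i in range(1000)]
before = {k: ring.get_node(k) for k in keys}
ring.add_node("cache-d")
moved = [k for k in keys if ring.get_node(k) != before[k]]
print(f"{len(moved)} of {len(keys)} keys moved, all to the new node")
```

When a node joins, only the keys that now fall on its virtual nodes are remapped, and they all move to the new node; keys owned by the other nodes stay put. Spreading each server over many virtual nodes evens out how much of the ring each server covers.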
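[Degree-granting institution]: National University of Defense Technology
[Degree level]: Master's
[Year degree granted]: 2012
[Classification number]: TP333
Article ID: 2110602
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/2110602.html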