Research on Caching Strategies for Distributed Object File Systems
Published: 2018-08-16 13:38
【Abstract】: Object storage provides a cross-platform, highly reliable, high-performance storage architecture that meets the needs of large-scale data storage. However, object storage systems still run into read/write performance bottlenecks, and with storage-device performance improving only slowly, caching offers an effective remedy. This thesis designs a caching strategy for the Cappella distributed object file system. A user-space caching scheme is implemented on the object-server side, optimizing and completing the original caching design. The strategy unifies the management and scheduling of cache space, improves the efficiency of cache-space allocation and reclamation, responds adaptively to cache resource usage, and regulates the read and write caching policies. When cache utilization reaches an upper watermark, the strategy starts write-cache persistence and read-cache eviction; when utilization falls back to a lower watermark, read-cache eviction stops. Write-cache persistence flushes the data written during the previous time window to disk, while read-cache eviction releases cache entries in ascending order of their read-cache weight. The write policy provides delayed, aggregated writes, effectively reducing write latency and the number of disk writes. The read policy optimizes data retrieval according to the characteristics of each read request, implementing sequential read-ahead and an optimized choice of read path when the cache is only partially hit. By using direct I/O, the strategy achieves good read/write performance with low memory usage, without depending on the Linux kernel page cache. With its parameters tuned for the target environment, the strategy can also be applied to other distributed object file systems. Test results show that the strategy substantially improves on the original scheme and outperforms the Linux kernel page cache for sequential reads and writes of large files.
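The watermark mechanism in the abstract — begin write-back and eviction at an upper utilization limit, stop eviction once utilization drops below a lower limit — can be sketched as a small hysteresis controller. This is a minimal illustration of the idea, not Cappella's actual code; all class and method names here are assumed.

```python
class CacheController:
    """Hypothetical sketch of the dual-watermark policy described in the
    abstract: eviction/write-back starts at the high watermark and only
    stops again at the low watermark (hysteresis avoids flapping)."""

    def __init__(self, capacity, high_wm=0.9, low_wm=0.7):
        self.capacity = capacity    # total cache bytes
        self.high_wm = high_wm      # upper utilization limit
        self.low_wm = low_wm        # lower utilization limit
        self.used = 0               # bytes currently cached
        self.evicting = False       # write-back/eviction active?

    def utilization(self):
        return self.used / self.capacity

    def on_allocation(self, nbytes):
        """Account for newly cached data; trigger eviction at the high watermark."""
        self.used += nbytes
        if self.utilization() >= self.high_wm:
            self.evicting = True    # start write-cache persistence + read-cache eviction

    def on_release(self, nbytes):
        """Account for freed cache space; stop eviction at the low watermark."""
        self.used -= nbytes
        if self.evicting and self.utilization() <= self.low_wm:
            self.evicting = False   # utilization recovered, stop evicting
```

The gap between the two watermarks matters: with a single threshold, the cache would oscillate between evicting and not evicting on every allocation near the limit.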
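Read-cache eviction is described as releasing the entries with the lowest read-cache weight in turn. A heap makes "lowest weight first" cheap; the weight metric itself is not specified in the abstract, so any recency/frequency score could stand in. The function below is an illustrative sketch with assumed names, not the thesis implementation.

```python
import heapq

def evict_lowest_weight(entries, bytes_needed):
    """Release read-cache entries in ascending weight order until at least
    `bytes_needed` bytes are freed.

    entries: list of (weight, key, size) tuples; lower weight = evict sooner.
    Returns (victim_keys, bytes_freed)."""
    heap = list(entries)
    heapq.heapify(heap)             # min-heap keyed on weight
    freed, victims = 0, []
    while heap and freed < bytes_needed:
        weight, key, size = heapq.heappop(heap)
        victims.append(key)         # this entry's cache space is released
        freed += size
    return victims, freed
```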
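The write policy's delayed writing and aggregation — fewer, larger disk writes per time window — comes down to merging overlapping or adjacent write ranges before flushing. A minimal sketch of that merge step (function name and tuple layout are assumptions for illustration):

```python
def aggregate_writes(writes):
    """Merge overlapping or adjacent (offset, length) writes collected in
    one time window into fewer, larger ranges to flush to disk."""
    merged = []
    for off, length in sorted(writes):          # process in offset order
        if merged and off <= merged[-1][0] + merged[-1][1]:
            # Overlaps or touches the previous range: extend it.
            last_off, last_len = merged[-1]
            new_end = max(last_off + last_len, off + length)
            merged[-1] = (last_off, new_end - last_off)
        else:
            merged.append((off, length))
    return merged
```

Four small writes that touch or overlap collapse into one disk operation, which is exactly where the abstract's reduction in write count comes from.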
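Sequential read-ahead, the other half of the read policy, requires first detecting that a stream of reads is sequential. One common shape for such a detector is sketched below; the class, window size, and streak threshold are illustrative assumptions, not details taken from the thesis.

```python
class ReadaheadDetector:
    """Detect sequential read streams and suggest a prefetch range."""

    def __init__(self, window=4):
        self.window = window   # how many request-lengths to read ahead
        self.last_end = None   # end offset of the previous read
        self.streak = 0        # count of consecutive sequential reads

    def next_prefetch(self, offset, length):
        """Record a read; return a (prefetch_offset, prefetch_length) hint
        once a sequential stream is confirmed, else None."""
        sequential = (self.last_end == offset)
        self.last_end = offset + length
        self.streak = self.streak + 1 if sequential else 0
        if self.streak >= 2:   # two sequential reads in a row: confirmed stream
            return (self.last_end, length * self.window)
        return None
```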
【Degree-granting institution】: Huazhong University of Science and Technology
【Degree level】: Master's
【Year conferred】: 2013
【CLC number】: TP333
Article ID: 2186149
Link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/2186149.html