Research on Caching and Prefetching Techniques in Hierarchical Hybrid Storage Systems
Published: 2018-01-12 05:15
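Keywords: Research on Caching and Prefetching Techniques in Hierarchical Hybrid Storage Systems  Source: Huazhong University of Science and Technology, 2013 doctoral dissertation  Type: degree thesis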
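【Abstract】: With the spread of solid-state disks (SSDs), hybrid storage systems built from SSDs and hard disk drives (HDDs) have become an active research topic. A hybrid storage system combines the high IOPS and low latency of SSDs with the large capacity and low cost of HDDs. Such systems are currently organized in three main ways: (1) the SSD serves as a read cache and write buffer to accelerate disk I/O; (2) the disk serves as a write buffer to reduce writes to the SSD; (3) both the SSD and the disk serve as permanent storage, and system performance is optimized through data migration and remapping. Of these, approach (1) is the most widely used: placing a small amount of SSD as a cache on top of an existing disk storage system substantially raises system IOPS, at low cost and with easy deployment.
SSDs tolerate only a limited number of program/erase (P/E) cycles, which restricts their use as disk caches, and their heavy internal garbage collection increases I/O access latency. In addition, when an SSD is used as a cache, its large capacity largely destroys the access locality seen by the underlying storage system, which degrades the I/O performance of the underlying storage devices. To address the lifetime, performance, and weakened-locality problems introduced by SSD caches, using caching and prefetching techniques to optimize hybrid storage systems composed of SSDs and HDDs has become an important topic in both industry and academia.
First, the dissertation proposes RAF (Random Access First), a hierarchical hybrid storage architecture based on an SSD cache (flash cache) and RAID. The architecture introduces a cost-benefit model suited to flash caches and supports a selective cache insertion policy that gives priority to random data. RAF hands the sequentially accessed data in a workload to the disk tier, so the SSD only needs to cache high-benefit random data, which extends SSD lifetime and shortens system response time. In addition, RAF partitions the SSD cache into read and write regions, which concentrates the distribution of invalid pages in the flash layer and thereby eases the selection of blocks to reclaim. Experimental results show that, compared with FlashCache on the same hardware configuration, RAF shortens average response time by 17% and reduces wear by 53%.
Second, main memory, the SSD cache, and the disk form a multi-level cache hierarchy. To address the pollution of the SSD cache by cold data in current multi-level cache hierarchies, the dissertation proposes CHPA (Characteristics between Hierarchies byPassing cache Algorithm), a bypass caching algorithm based on inter-tier access characteristics. CHPA is a non-synchronized multi-level caching algorithm designed to reduce SSD write overhead. It predicts hot data blocks from the access characteristics of blocks within the DRAM cache and between tiers. When a block is evicted from the upper-level DRAM cache, hot blocks are inserted into the SSD cache, while cold data bypasses the SSD tier to reduce write overhead.
Third, the dissertation proposes FLAP (FLash-Aware Prefetching), an SSD-based sequential prefetching policy. FLAP is an aggressive prefetching policy with high accuracy. Using a quantitative analysis model based on a relation graph, and exploiting the large capacity of the SSD cache, it issues high-accuracy, long prefetches to the disk device on a cache miss in order to reduce prefetching cost. In addition, FLAP sets aside a dedicated prefetch region in the SSD space to store prefetched data and applies a time-correlated data layout policy, which greatly improves garbage collection efficiency in the prefetch region and minimizes the impact of prefetching on SSD lifetime.
Finally, because the two-level cache formed by DRAM and the SSD has already filtered out most data accesses with strong temporal locality, sequential prefetching becomes more important in the underlying disk system (such as a RAID system). The dissertation therefore proposes SoAP (Strip-oriented Asynchronous Prefetching), a strip-oriented prefetching algorithm for sequential streams. SoAP is a prefetching policy designed specifically for parallel disk systems. It aligns prefetch requests with strip boundaries and splits them into strip-based sub-requests to solve problems such as sequentiality loss. In addition, with a multi-queue mechanism and an asynchronous scheduling policy, SoAP uses idle disk bandwidth to perform prefetching, which lowers prefetch overhead.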
[Abstract]: With the spread of solid-state disks (SSDs), hybrid storage systems based on SSDs and hard disk drives (HDDs) have become a research hotspot. A hybrid storage system combines the high IOPS and low latency of SSDs with the large capacity and low cost of HDDs. There are currently three main ways to organize a hybrid storage system: (1) the SSD acts as a read cache and write buffer to speed up disk I/O; (2) the disk acts as a write buffer to reduce SSD writes; (3) the SSD and the disk both act as permanent storage, with data migration and remapping used to optimize system performance. Approach (1) only requires placing a small amount of SSD as a cache on top of an existing disk storage system to improve system IOPS significantly. It is low-cost and easy to deploy, and it is the most widely used.
An SSD supports only a limited number of program/erase (P/E) cycles, which limits its use as a disk cache, and its heavy internal garbage collection increases I/O access latency. In addition, when an SSD is used as a cache, its large capacity largely destroys the access locality of the underlying storage system and so affects the I/O performance of the underlying storage devices. Given the lifetime, performance, and weakened-locality problems that an SSD cache brings, using caching and prefetching techniques to optimize hybrid storage systems composed of SSDs and HDDs has become an important topic in industry and academia.
First, we propose RAF (Random Access First), a hierarchical hybrid storage architecture based on an SSD cache (flash cache) and RAID. The architecture introduces a cost-benefit model suited to flash caches and supports a selective cache insertion policy that gives priority to random data. RAF leaves the sequentially accessed data in the workload to the disk tier, so the SSD only caches random data with high benefit, which extends SSD lifetime and shortens system response time. In addition, by dividing the SSD cache into read and write regions, RAF concentrates the distribution of invalid pages in the flash layer, which helps in selecting blocks to reclaim. The experimental results show that, compared with FlashCache on the same hardware configuration, RAF shortens the average response time by 17% and reduces wear by 53%.
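The admission idea in RAF can be illustrated with a minimal sketch. The abstract does not give the cost-benefit model or the sequentiality detector, so everything below is an assumption made for illustration: the class name `RandomFirstCache`, the single-stream detector (a block counts as sequential if it directly follows the previous request), and plain LRU inside each region. It is not the dissertation's implementation.

```python
from collections import OrderedDict


class RandomFirstCache:
    """Toy SSD cache that admits only random-access blocks (hypothetical sketch)."""

    def __init__(self, read_capacity, write_capacity):
        # Separate LRU regions for read and write data, mirroring the
        # read/write split described in the abstract.
        self.regions = {"read": OrderedDict(), "write": OrderedDict()}
        self.capacity = {"read": read_capacity, "write": write_capacity}
        # Block number that would continue the current sequential run.
        self.next_expected = None

    def _is_sequential(self, block):
        # Assumption: a request is sequential if it directly follows the
        # previous request. Real detectors track many streams at once.
        sequential = block == self.next_expected
        self.next_expected = block + 1
        return sequential

    def access(self, block, op):
        """Handle one single-block request; op is 'read' or 'write'.

        Returns 'hit', 'admitted', or 'bypassed'.
        """
        sequential = self._is_sequential(block)
        region = self.regions[op]
        if block in region:
            region.move_to_end(block)  # refresh LRU position
            return "hit"
        if sequential:
            return "bypassed"  # sequential data is left to the disk tier
        region[block] = True
        if len(region) > self.capacity[op]:
            region.popitem(last=False)  # evict the least recently used block
        return "admitted"
```

In this sketch, a scan over blocks 100, 101, 102 admits only the first block, while isolated requests are always admitted. A fuller model would replace the boolean test with a per-request benefit estimate.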
Second, main memory, the SSD cache, and the disk constitute a multi-level cache hierarchy. To address the pollution of the SSD cache by cold data in current multi-level cache hierarchies, we propose CHPA (Characteristics between Hierarchies byPassing cache Algorithm), a bypass caching algorithm based on inter-tier access characteristics. CHPA is a non-synchronized multi-level caching algorithm designed to reduce SSD write overhead. It predicts hot data blocks from the access characteristics of blocks within the DRAM cache and between tiers. When a block is evicted from the upper-level DRAM cache, a hot block is inserted into the SSD cache, while cold data bypasses the SSD tier to reduce write overhead.
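The bypass decision at DRAM eviction time can be sketched as follows. The abstract does not specify how CHPA predicts hot blocks, so this sketch substitutes a simple stand-in: a block is treated as hot if it was accessed at least `hot_threshold` times while resident in DRAM. The class name, the threshold, and the LRU replacement in both tiers are assumptions for illustration only.

```python
from collections import OrderedDict


class BypassingTwoLevelCache:
    """Toy DRAM + SSD cache with cold-data bypass (hypothetical sketch)."""

    def __init__(self, dram_capacity, ssd_capacity, hot_threshold=2):
        self.dram = OrderedDict()  # block -> access count while in DRAM
        self.ssd = OrderedDict()   # block -> True, kept in LRU order
        self.dram_capacity = dram_capacity
        self.ssd_capacity = ssd_capacity
        self.hot_threshold = hot_threshold
        self.ssd_writes = 0        # number of blocks written to the SSD

    def access(self, block):
        """Return 'dram_hit', 'ssd_hit', or 'miss' for one block access."""
        if block in self.dram:
            self.dram[block] += 1
            self.dram.move_to_end(block)
            return "dram_hit"
        if block in self.ssd:
            self.ssd.move_to_end(block)
            result = "ssd_hit"
        else:
            result = "miss"
        self.dram[block] = 1
        if len(self.dram) > self.dram_capacity:
            victim, count = self.dram.popitem(last=False)
            self._demote(victim, count)
        return result

    def _demote(self, block, count):
        # Stand-in hotness test (assumption): enough DRAM accesses means hot.
        if count < self.hot_threshold:
            return  # cold block bypasses the SSD, so no flash write occurs
        if block in self.ssd:
            return  # already cached on the SSD, nothing to write
        self.ssd[block] = True
        self.ssd_writes += 1
        if len(self.ssd) > self.ssd_capacity:
            self.ssd.popitem(last=False)
```

The counter `ssd_writes` makes the effect visible: blocks touched only once in DRAM never reach the SSD, so they cost no flash writes.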
Third, we propose FLAP (FLash-Aware Prefetching), an SSD-based sequential prefetching policy. FLAP is an aggressive prefetching policy with high accuracy. Using a quantitative analysis model based on a relation graph, and taking advantage of the large capacity of the SSD cache, it performs high-accuracy, long prefetches from the disk device on a cache miss in order to reduce the cost of prefetching. In addition, FLAP sets aside a dedicated prefetch region in the SSD space to store prefetched data and uses a time-correlated data layout policy, which greatly improves garbage collection efficiency in the prefetch region and minimizes the impact of prefetching on SSD lifetime.
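A minimal sketch of prefetching into a dedicated region follows. FLAP's graph-based analysis model is not described in the abstract, so the sketch uses a plain run-length trigger in its place, and it models the time-correlated layout as a FIFO region in which blocks leave in the order they were written. The names `PrefetchRegion` and `AggressivePrefetcher` and the parameters `degree` and `trigger` are hypothetical.

```python
from collections import deque


class PrefetchRegion:
    """FIFO region: blocks are evicted in the order they were written."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.order = deque()  # write order, oldest on the left
        self.blocks = set()

    def __contains__(self, block):
        return block in self.blocks

    def insert(self, block):
        if block in self.blocks:
            return
        if len(self.order) == self.capacity:
            self.blocks.discard(self.order.popleft())  # drop the oldest block
        self.order.append(block)
        self.blocks.add(block)


class AggressivePrefetcher:
    """Prefetch a long window on a miss inside a sequential run (sketch)."""

    def __init__(self, region, degree=8, trigger=2):
        self.region = region
        self.degree = degree    # number of blocks fetched per prefetch
        self.trigger = trigger  # run length needed before prefetching
        self.last_block = None
        self.run_length = 0

    def on_access(self, block):
        """Return the list of blocks prefetched because of this access."""
        hit = block in self.region
        if self.last_block is not None and block == self.last_block + 1:
            self.run_length += 1
        else:
            self.run_length = 1
        self.last_block = block
        if hit or self.run_length < self.trigger:
            return []
        window = range(block + 1, block + 1 + self.degree)
        fetched = [b for b in window if b not in self.region]
        for b in fetched:
            self.region.insert(b)
        return fetched
```

Because the region is written and evicted in time order, data that becomes invalid together sits together, which is the property that keeps garbage collection cheap in this simplified model.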
Finally, because the two-level cache composed of DRAM and the SSD has already filtered out most data accesses with strong temporal locality, sequential prefetching becomes more important in the underlying disk system (such as a RAID system). We therefore propose SoAP (Strip-oriented Asynchronous Prefetching), a strip-oriented prefetching algorithm for sequential streams. SoAP is a prefetching policy designed specifically for parallel disk systems. It aligns prefetch requests with strip boundaries and splits them into strip-based sub-requests to solve problems such as sequentiality loss. In addition, with a multi-queue mechanism and an asynchronous scheduling policy, SoAP uses idle disk bandwidth to perform prefetching, thereby reducing prefetch overhead.
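The strip alignment and splitting step can be illustrated with a short sketch. It assumes a RAID-0 style layout without parity, in which strip `i` lives on disk `i % num_disks`; the abstract does not state the layout, and the function names are hypothetical. The multi-queue and asynchronous scheduling parts of SoAP are not modeled here.

```python
def align_to_strips(start, length, strip_size):
    """Expand [start, start + length) outward to whole strip boundaries.

    Returns (aligned_start, aligned_length), both in blocks.
    """
    aligned_start = (start // strip_size) * strip_size
    end = start + length
    aligned_end = -(-end // strip_size) * strip_size  # round up to a boundary
    return aligned_start, aligned_end - aligned_start


def split_by_strip(start, length, strip_size, num_disks):
    """Split a prefetch request into one sub-request per strip.

    Assumes a RAID-0 style layout (no parity): strip i is stored on disk
    i % num_disks at offset (i // num_disks) * strip_size on that disk.
    Returns a list of (disk, disk_offset, sub_length) tuples.
    """
    aligned_start, aligned_length = align_to_strips(start, length, strip_size)
    first_strip = aligned_start // strip_size
    last_strip = (aligned_start + aligned_length) // strip_size
    sub_requests = []
    for strip in range(first_strip, last_strip):
        disk = strip % num_disks
        disk_offset = (strip // num_disks) * strip_size
        sub_requests.append((disk, disk_offset, strip_size))
    return sub_requests
```

With a 64-block strip, a request for blocks 100 to 149 widens to blocks 64 to 191, so each disk involved reads a whole strip instead of a fragment that a later request would have to revisit.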
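【Degree-granting institution】: Huazhong University of Science and Technology
【Degree level】: Doctorate
【Year degree granted】: 2013
【Classification number】: TP333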
Article ID: 1412843
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/1412843.html