Research on the Application of NVDIMM to Data Protection in Distributed Storage
Published: 2019-01-14 13:18
[Abstract]: Since the 1990s, distributed storage systems have risen to prominence in the storage industry and moved into a range of critical application domains, where their data safety and performance have come under increasing pressure. To sustain performance, most distributed storage systems adopt data caching at the expense of data reliability, which exposes them to data-safety risks and to challenges in data-consistency repair performance. Storage systems reduce the risk of losing cached data in different ways. Disk array systems usually protect in-memory data with batteries, and some distributed storage systems also use battery-backed memory; most distributed storage systems instead rely on cache backup, replicating cached data to multiple nodes simultaneously to limit the impact of a single node losing power. Meanwhile, the non-volatile dual inline memory module (NVDIMM), which uses supercapacitors to move data safely from memory to flash when the system loses power, has matured, and Intel's current generation of processors supports NVDIMM as a standard device.

This study applies NVDIMM to a distributed storage system to provide power-loss protection for the data cache across the whole system, improving both data safety and performance. To that end, the thesis proposes a scheme for using and managing an NVDIMM cache that protects cached data comprehensively, so that after a failure such as power loss a simple recovery strategy suffices to ensure that no data is lost. To preserve consistency, it designs in detail an NVDIMM-based distributed logging strategy that keeps data consistent across nodes after the system loses power. In addition, because distributed storage systems hold very large numbers of files and failure recovery therefore takes a long time, it proposes an NVDIMM-based data-consistency repair strategy that, once a powered-off system is back up, quickly restores massive numbers of small files to a complete state. Together, these pieces form a complete NVDIMM-based cache data protection scheme for distributed storage systems.

The author validated the scheme with practical tests covering data protection after abnormal power loss, data read/write performance, and data-consistency repair performance. According to the measurements, a distributed storage system equipped with NVDIMM protects data when a data node loses power abnormally, raises write bandwidth by 34.11%, shortens write response time by 25.75%, and speeds up data repair after abnormal power loss by a factor of about 20 in workloads with massive numbers of small files.
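[Degree-granting institution]: University of Chinese Academy of Sciences (School of Engineering Management and Information Technology, Chinese Academy of Sciences)
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP333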
Article ID: 2408731
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/2408731.html