Research on Secure Cloud Storage Based on a Hybrid Redundancy Strategy
Published: 2019-05-16 20:03
[Abstract]: Cloud storage is a new Internet-based storage model that offers cost-effective and convenient storage services, but its security has drawn wide attention and research. To ensure data reliability and integrity, cloud storage relies on two main measures to protect user data. First, user data is stored redundantly to prevent loss caused by software or hardware faults. Second, users are offered a data integrity verification service, and erroneous data can be recovered efficiently when a storage node is found to have erred or failed.

At present, redundant storage relies on two main strategies: multiple replicas and erasure codes. Replication is simple to design and supports highly concurrent access, but it multiplies the storage space consumed. Erasure coding offers strong fault tolerance and high space utilization, but the computational overhead and access latency of encoding and decoding degrade the user experience. This work first analyzes the shortcomings of relying on a single redundancy strategy and proposes a dynamic replication scheme based on erasure codes (Dynamic Replication Based Erasure Codes, DRBEC), which applies a replica strategy on top of an erasure-code strategy. To account for the bandwidth cost of file repair, the scheme adopts regenerating codes as the erasure coding scheme: files are encoded with regenerating codes for storage, and the number of replicas of each file is dynamically created and adjusted according to the file access popularity predicted by curve fitting, exploiting the high I/O throughput of multiple replicas. Second, for archival data in a low-dynamic state, this work combines regenerating codes with MD5 and, relying on the uniqueness of MD5 values, presents and implements a data integrity verification and data recovery scheme for regenerating codes. The MD5 value of each regenerating-code fragment is computed, encrypted, and stored at random across the data nodes of that file, so that users can verify the integrity of remote data without downloading the original file.

Finally, a prototype cluster storage system is built on Xen virtual machines, and experiments analyze the storage space consumption and access performance of DRBEC, as well as the feasibility and reliability of the MD5-based data integrity verification scheme under regenerating-code encoding. The results show that the hybrid redundancy scheme achieves high space utilization and low average access latency, and improves the success rate of user accesses. The MD5-based integrity verification scheme for regenerating codes is effective and reliable: it reduces storage and communication overhead, accurately locates failed nodes, and recovers erroneous data effectively under low bandwidth, safeguarding the integrity and validity of the data.
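The per-fragment MD5 check described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the regenerating-code encoding is replaced by a plain split into equal pieces, the encryption and randomized placement of the digests are omitted, and the function names (`split_into_fragments`, `fragment_digests`, `find_corrupted`) are hypothetical. It only shows how comparing digests can locate a damaged fragment without transferring the file itself.

```python
import hashlib
from typing import List


def split_into_fragments(data: bytes, n: int) -> List[bytes]:
    """Assumption: a plain split stands in for regenerating-code encoding."""
    size = -(-len(data) // n)  # ceiling division
    return [data[i * size:(i + 1) * size] for i in range(n)]


def fragment_digests(fragments: List[bytes]) -> List[str]:
    """Compute one MD5 hex digest per fragment."""
    return [hashlib.md5(f).hexdigest() for f in fragments]


def find_corrupted(reference: List[str], reported: List[str]) -> List[int]:
    """Return the indices of fragments whose digest no longer matches."""
    return [i for i, (a, b) in enumerate(zip(reference, reported)) if a != b]


data = b"archived file contents " * 8
fragments = split_into_fragments(data, 4)
reference = fragment_digests(fragments)  # digests recorded at upload time

# Simulate silent corruption of the fragment held by node 2.
stored = list(fragments)
stored[2] = bytes([stored[2][0] ^ 0xFF]) + stored[2][1:]

# Each node reports only the digest of its fragment, not the fragment itself.
reported = fragment_digests(stored)
bad_nodes = find_corrupted(reference, reported)
print(bad_nodes)
```

In this sketch only the short digests are compared, which reflects the low communication overhead the abstract attributes to the scheme; repairing the located fragment would rely on the regenerating code, which is not modeled here.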
Article ID: 2478526
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/2478526.html