Research and Application of Key Technologies for High Reliability in Massive Data Storage Systems

Published: 2018-08-31 07:37

[Abstract]: With the continuous development of information technology, data has become an increasingly important resource in daily life. According to statistics, the total amount of digital information created, stored, and replicated worldwide reached 1.2 ZB in 2010 and a milestone 1.8 ZB in 2011; the growth is accelerating and was expected to reach nearly 8 ZB by 2015. Such explosive growth inevitably drives a continuing increase in storage devices. Modern data centers for massive data storage now contain anywhere from tens of thousands to hundreds of thousands of storage nodes, and at that scale disk damage and storage node failure are the norm. At the same time, data becoming inaccessible or lost because of network equipment or other components of a storage node also occurs from time to time. As storage demand keeps expanding, users place higher requirements on the reliability and availability of data storage. Traditional techniques cannot cope with the current situation, and achieving low-redundancy, high-reliability storage of massive data has become a major challenge for the industry.
Therefore, this thesis addresses the key problems in building a low-redundancy, high-reliability massive data storage system. After summarizing current data reliability enhancement theory and the basic architecture of massive data storage systems, it studies high-performance erasure-tolerant data layout algorithms and high-reliability storage architectures. The main results are as follows:
1. For RAID technology, which is already widely used in data storage systems, a new XOR-based horizontal array erasure code, the EX-ENOD code, is proposed. The code tolerates the random erasure of any three columns and is maximum distance separable (MDS). Based on the geometric construction of the code, a low-complexity decoding method is proposed; its computational complexity is lower than that of the known decoding methods for other erasure codes that correct three random column erasures. The method is also general and can be extended to the decoding of STAR codes and EEOD codes.
2. To meet the growing scale and rising reliability requirements of mass storage systems, the thesis introduces a systematic Vandermonde coding method over the {0,1} symbol field into the storage system. The method inherits a key advantage of traditional Vandermonde codes built over finite fields: its parameters are not limited by the number of storage nodes or by the fault-tolerance parameter. It also achieves optimal storage efficiency, and it avoids the large number of table-lookup operations that codes built over traditional finite fields require. A storage system built on this method can tolerate the failure of up to half of its storage nodes while the data remains available, and in that case the system needs only as much redundant data as original data.
3. Based on the characteristics of the encoding matrix over the {0,1} symbol field, and specifically the distribution of "1" elements in its row vectors, an optimization algorithm is proposed that reduces the computational complexity of encoding and decoding. To address the high reconstruction bandwidth of the traditional decoding and reconstruction process, a decoding method based on the parity-check matrix is proposed, and a low-bandwidth reconstruction algorithm is given that depends on the column vectors of the parity-check matrix and on the number of data items the storage system needs to reconstruct. This low-bandwidth reconstruction algorithm can be generalized to all coded storage systems built over the {0,1} symbol field.
4. Based on the data layout characteristics of the coding redundancy strategy, an infrastructure for a low-redundancy, high-reliability massive data storage system is designed. The system brings data deduplication and coding-based redundancy for reliability enhancement into one unified architecture. An energy-saving design for storage nodes is derived from the distribution of the coded redundant data, non-uniform storage and adaptive read strategies are proposed according to how the data is used, and an operating strategy is proposed in which deduplication and data verification run cooperatively.
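To make the idea in points 2 and 3 more concrete, the following is a minimal sketch of coding over the {0,1} symbol field, where every coefficient is 0 or 1 and all arithmetic on data blocks reduces to XOR, so no finite-field lookup tables are needed. It is not the thesis's construction: the function names (`xor_blocks`, `encode`, `decode`) and the small parity matrix are hypothetical, the toy matrix is not a Vandermonde matrix, and it is not claimed to be MDS. Decoding here is plain Gauss-Jordan elimination over GF(2), not the low-bandwidth parity-check-matrix method described above.

```python
def xor_blocks(blocks, size):
    """XOR a list of equal-length byte blocks together."""
    out = bytearray(size)
    for block in blocks:
        for i, value in enumerate(block):
            out[i] ^= value
    return bytes(out)


def encode(data, parity_rows):
    """Compute one parity block per 0/1 row: the XOR of the selected data blocks."""
    size = len(data[0])
    return [
        xor_blocks([d for d, bit in zip(data, row) if bit], size)
        for row in parity_rows
    ]


def decode(k, parity_rows, surviving):
    """Recover the k data blocks from surviving blocks.

    `surviving` maps a block index to its bytes: indices 0..k-1 are data
    blocks, indices k.. are parity blocks. Raises ValueError when the
    surviving blocks do not determine the data.
    """
    # Each surviving block is one linear equation over GF(2).
    eqs = []
    for idx, block in surviving.items():
        if idx < k:
            row = [int(j == idx) for j in range(k)]
        else:
            row = list(parity_rows[idx - k])
        eqs.append((row, bytearray(block)))

    # Gauss-Jordan elimination: row operations are XORs on both sides.
    for col in range(k):
        pivot = next(
            (r for r in range(col, len(eqs)) if eqs[r][0][col]), None
        )
        if pivot is None:
            raise ValueError("surviving blocks are insufficient to decode")
        eqs[col], eqs[pivot] = eqs[pivot], eqs[col]
        pivot_row, pivot_block = eqs[col]
        for r in range(len(eqs)):
            if r != col and eqs[r][0][col]:
                row, block = eqs[r]
                for j in range(k):
                    row[j] ^= pivot_row[j]
                for i in range(len(block)):
                    block[i] ^= pivot_block[i]
    return [bytes(eqs[c][1]) for c in range(k)]


if __name__ == "__main__":
    data = [b"abcd", b"efgh", b"ijkl"]
    rows = [(1, 1, 0), (0, 1, 1)]  # hypothetical toy parity matrix
    parity = encode(data, rows)
    # Lose data blocks 0 and 1; keep data block 2 and both parity blocks.
    recovered = decode(3, rows, {2: data[2], 3: parity[0], 4: parity[1]})
    print(recovered == data)
```

Because each parity block touches only the data blocks whose coefficient is 1, the number of XOR operations grows with the number of "1" elements in each row. That is the quantity the optimization in point 3 targets, although this sketch makes no attempt to minimize it.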
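[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Doctoral
[Year degree conferred]: 2013
[Classification number]: TP333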

[References]