
Research on a Hierarchical Data Deduplication Technique

Published: 2018-03-25 07:39

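  Topic: deduplication system. Focus: hierarchical architecture. Source: University of Electronic Science and Technology of China (电子科技大学), master's thesis, 2013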


[Abstract]: As enterprise and personal user data grow rapidly, data centers face ever higher demands on storage capacity. Statistics show that a considerable share of this data is redundant, so detecting and removing redundant data to improve data center storage performance has become increasingly urgent and has clear practical value.

This thesis first introduces the background of data deduplication, analyzes the deduplication products of the major vendors, and reviews the related techniques. On this basis, it completes the following work.

First, a hierarchical deduplication architecture is designed. The control server is separated from the information server: the former handles transaction processing and the latter stores file metadata. Within the information server, data is stored in tiers: file fingerprints stay resident in memory, chunk metadata is placed on solid-state drives or disks, and the actual file data is kept on inexpensive storage devices, so that memory and disk space are used sensibly and efficiency improves.

Second, in the preprocessing module, data is classified before processing, and a byte-based maximum increasing sequence chunking algorithm, the BFMIS algorithm, is proposed to address the hard-chunking problem in variable-length chunking. To deal with data collisions, a key problem in deduplication systems, the classic SHA-1 algorithm is optimized: its step function is modified, the degree of expansion of message modifications is increased, and the message digest is lengthened. These changes improve the collision resistance of SHA-1 and lower the false-deletion rate of the deduplication system. A multi-dimensional Bloom Filter algorithm is also proposed. It extends the bit array of the ordinary Bloom Filter to reduce its misjudgment (false-positive) rate, addresses redundancy detection over massive data, and strengthens the dynamic scalability of the Bloom Filter in distributed environments, which improves the scalability of the whole deduplication system.

The thesis also describes the tag data redundancy problem in RFID networks and the CLIF and INPFM deduplication mechanisms. It applies the hierarchical deduplication framework to RFID networks, treating RFID tag data as preprocessed metadata to be organized in tiers and deduplicated.

Finally, experimental tests are carried out. The results show that the optimized SHA-1 algorithm improves overall collision resistance; the multi-dimensional Bloom Filter algorithm lowers the misjudgment rate and improves dynamic scalability; the multi-level RFID deduplication algorithm outperforms existing algorithms in both time efficiency and deduplication rate, although it produces a certain number of misjudgments; and the overall throughput and deduplication rate of the system meet the expected targets.
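
The tiered lookup described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and everything in it is an assumption made for illustration: the class names `BloomFilter` and `TieredDedupStore` are hypothetical, chunking is fixed-size rather than BFMIS (whose details the abstract does not give), fingerprints use standard SHA-1 from `hashlib` rather than the modified SHA-1, the filter is an ordinary single-array Bloom filter rather than the multi-dimensional variant, and plain dictionaries stand in for the SSD/disk and bulk-storage tiers.

```python
import hashlib


class BloomFilter:
    """Ordinary single-array Bloom filter (NOT the multi-dimensional variant)."""

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key: bytes):
        # Derive k bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


class TieredDedupStore:
    """Hypothetical three-tier store: Bloom filter, chunk index, chunk data."""

    def __init__(self, chunk_size: int = 4096):
        self.chunk_size = chunk_size
        self.bloom = BloomFilter()  # tier 1: in-memory fingerprint summary
        self.chunk_index = {}       # tier 2: fingerprint -> chunk metadata
        self.chunk_data = {}        # tier 3: fingerprint -> raw chunk bytes
        self.files = {}             # file name -> ordered list of fingerprints

    def put(self, name: str, data: bytes) -> int:
        """Store a file and return how many new chunks were written."""
        recipe, new_chunks = [], 0
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            fp = hashlib.sha1(chunk).digest()  # standard SHA-1 (assumption)
            # A Bloom filter can report false positives, so a hit is
            # confirmed against the index before the chunk is skipped.
            is_dup = self.bloom.might_contain(fp) and fp in self.chunk_index
            if not is_dup:
                self.bloom.add(fp)
                self.chunk_index[fp] = {"length": len(chunk), "refs": 0}
                self.chunk_data[fp] = chunk
                new_chunks += 1
            self.chunk_index[fp]["refs"] += 1
            recipe.append(fp)
        self.files[name] = recipe
        return new_chunks

    def get(self, name: str) -> bytes:
        """Rebuild a file from its chunk recipe."""
        return b"".join(self.chunk_data[fp] for fp in self.files[name])
```

With `chunk_size=4`, storing `b"AAAABBBBAAAA"` writes two chunks, because the third chunk repeats the first. Checking the in-memory filter first means that most lookups for unseen chunks never reach the slower index tier, which is the motivation the abstract gives for keeping fingerprints resident in memory.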
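[Degree-granting institution]: University of Electronic Science and Technology of China (电子科技大学)
[Degree level]: Master's
[Year degree awarded]: 2013
[Classification number]: TP333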



