
Research on Architecture-Driven and Business-Driven Testing Methods for HBase

Published: 2018-05-12 23:34

Topics: test model + data sharding; Source: East China University of Science and Technology, 2013 master's thesis


[Abstract]: The development and widespread adoption of Internet technology have produced massive volumes of data, and relational databases can no longer meet the needs of emerging businesses. To cope with the storage and management of such data, vendors and organizations have released a variety of NoSQL storage solutions. HBase is one of the best known of these and is widely used across industries. Against this background, this thesis studies performance testing methods for the HBase storage solution.

First, the thesis introduces the rise and development of NoSQL databases and reviews the state of research on database performance testing and the general performance testing process. It finds that traditional performance testing methods do not adequately consider the influence of system architecture on performance, even though several new features of current NoSQL architectures, such as dynamic scheduling, multi-replica data placement, and configurability, have a very significant effect on performance. In addition, within a given business, system performance varies considerably under different user behaviors. On this basis, the thesis proposes a performance testing model for NoSQL database systems that covers both architecture-driven and business-driven performance testing, addresses the shortcomings of traditional methods, and is applied in practice with HBase as the example.

For architecture-driven performance testing, the thesis studies how HBase implements its column-family-oriented data model, its read and write paths, the sharded storage of tables, and its data replication strategy. It analyzes the architectural factors that affect performance: a single field versus multiple fields within one column family at the same data size, a single column family versus multiple column families, the size of data shards, and the data replication factor. For each of these factors it designs and implements a test scheme and measures how strongly the factor affects HBase performance.

For business-driven performance testing, the thesis studies how to test HBase performance in a search engine workload. Because user traffic in search engine services varies periodically, the thesis proposes a hidden-periodicity test model based on time series and applies it to business-driven HBase performance testing for search engines. During modeling, wavelet techniques are used to filter out anomalous user traffic data. To reflect HBase performance in search engine services more realistically, the thesis also designs a performance test scheme based on user behavior characteristics, and implements and validates it on top of the YCSB test suite.

Finally, the thesis summarizes its work and outlines directions for future work.
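The idea that periodic user traffic can drive a load model can be illustrated with a small sketch. The code below is not the thesis's hidden-periodicity model and does not reproduce its wavelet-based filtering. It only shows, under stated assumptions, how a dominant period could be recovered from a traffic series with a plain discrete Fourier transform. The function names `dominant_period` and `synthetic_hourly_load`, and the synthetic data itself, are hypothetical and chosen purely for illustration.

```python
import cmath
import math


def dominant_period(series):
    """Return the period (in samples) of the strongest non-constant DFT component."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # remove the constant offset
    best_k, best_power = 1, -1.0
    # Check frequencies k = 1 .. n/2; k cycles over n samples means period n / k.
    for k in range(1, n // 2 + 1):
        coeff = sum(
            centered[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k


def synthetic_hourly_load(days=7, base=1000.0, amplitude=400.0):
    """Hypothetical hourly request counts with a 24-hour cycle (illustrative only)."""
    return [
        base + amplitude * math.sin(2 * math.pi * hour / 24)
        for hour in range(days * 24)
    ]


load = synthetic_hourly_load()   # one week of made-up hourly traffic
period = dominant_period(load)   # period of the strongest cycle, in hours
print(period)
```

In a real test, the series would come from recorded access logs rather than a generated sine wave, and anomalous samples would need to be filtered before estimating the period, which is the role the abstract assigns to wavelet techniques.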
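[Degree-granting institution]: East China University of Science and Technology
[Degree level]: Master's
[Year degree conferred]: 2013
[Classification number]: TP311.13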



