Design and Implementation of a Hybrid Storage System for Key-Value Database Applications
Published: 2019-06-04 02:40
[Abstract]: The rapid growth of large-scale Internet applications places higher demands on the scalability of storage systems. Thanks to their simple and efficient data model, key-value databases scale considerably better than traditional database systems. Meanwhile, as hardware has advanced, the cost-effectiveness of solid-state drives (SSDs) has steadily improved, and they have become the first choice for a growing number of systems. Their strength in random reads closely matches the characteristics of Internet applications, but their relatively poor write performance and limited number of erase cycles restrict the scenarios in which they can be used. Hybrid storage systems that combine SSDs with hard disks have therefore attracted wide attention, and a hybrid storage system designed specifically for key-value databases is a worthwhile point of combination.
Targeting the workload characteristics of Web applications, operations on the key-value database are recorded sequentially as a log. Every operation is simply appended to memory, and once the data in memory reaches a certain threshold it is flushed to back-end storage in a single pass. Logging to some extent trades read performance for write performance, so it naturally complements SSDs. Because the back-end SSD and hard disk have different characteristics, a tiered approach classifies writes by their characteristics and directs them to different devices. On this basis, HybridFS, a file system with customizable file placement and migration policies, is designed and implemented. Through monitoring and analysis scripts, different placement and migration policies can be chosen for files with different characteristics within the same file system. The policies follow the file access patterns of key-value databases. Log files are mostly written once and rarely read, so they are written directly to the hard disk. Metadata files are read and written frequently but are small and few in number, so they are written to the SSD. Data files are massive in volume, fixed in length, and written once but read many times, so they are written selectively to the hard disk or the SSD according to the workload. For pure-write workloads, probabilistic selection gives an improvement of 5%-56% over Flashcache; for mixed read-write workloads, LRU-based migration yields a 4%-14% performance improvement over Flashcache.
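The placement rules in the abstract can be illustrated with a small sketch. This is not HybridFS code: the abstract does not give the interface, the probability used for data files, or the details of the LRU migration, so the names `place_file` and `LruSsdTier`, the `ssd_probability` parameter, and the capacity counted in files are all assumptions made for illustration.

```python
import random
from collections import OrderedDict
from enum import Enum


class Device(Enum):
    SSD = "ssd"
    HDD = "hdd"


class FileKind(Enum):
    LOG = "log"
    METADATA = "metadata"
    DATA = "data"


def place_file(kind, ssd_probability=0.5, rng=random):
    """Pick the device a newly created file is written to (hypothetical helper)."""
    if kind is FileKind.LOG:
        return Device.HDD  # written once, rarely read
    if kind is FileKind.METADATA:
        return Device.SSD  # small, few, frequently read and written
    # Data files: probabilistic choice; the default 0.5 is an assumed value.
    return Device.SSD if rng.random() < ssd_probability else Device.HDD


class LruSsdTier:
    """Tracks which data files sit on the SSD and demotes the least recently used.

    Capacity is counted in files for simplicity (an assumption of this sketch).
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self._files = OrderedDict()  # oldest access first, newest last

    def access(self, name):
        """Record an access to `name`; return the files to migrate to the HDD."""
        if name in self._files:
            self._files.move_to_end(name)  # refresh recency
            return []
        self._files[name] = True  # promote to the SSD
        demoted = []
        while len(self._files) > self.capacity:
            victim, _ = self._files.popitem(last=False)
            demoted.append(victim)
        return demoted

    def on_ssd(self, name):
        return name in self._files
```

In this sketch, `place_file` covers the initial write decision (the pure-write case), while `LruSsdTier` stands in for the migration decision under mixed read-write load: a file that is accessed moves to the SSD, and the file that has gone longest without access moves back to the hard disk.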
Article ID: 2492413
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/2492413.html