
GPU-Based Multi-Join Query Optimization

Published: 2018-09-09 20:14
[Abstract]: With the arrival of the information age, the demands on data processing keep rising. On the one hand, data is becoming more complex and its volume is growing enormously; on the other hand, data processing is expected to deliver low latency and high throughput. The serial processing of traditional databases on a single-machine platform can no longer meet these needs, and parallel processing is an effective way to meet the demands of big-data processing. The general-purpose graphics processing unit (GPU), with its very high compute capability and memory bandwidth, has become a powerful tool for parallel computing and provides hardware support for accelerating data processing. Multi-join queries are among the most common and most time-consuming operations in data processing, and their efficiency is an important factor in database performance. This thesis therefore uses the GPU as a hardware platform to study, design, and implement optimizations of multi-join operations. Multi-join query optimization on the GPU is divided into two stages. The first stage builds a cost model for joins and uses a heuristic algorithm to obtain a multi-join query tree with minimum cost. The second stage performs parallel optimization on the GPU over this minimum-cost query tree. Parallel optimization on the GPU can exploit parallelism both within each join and across joins. Only by combining several forms of parallel optimization can the GPU's parallel processing capability be fully used and the performance of multi-join query processing be maximized. First, the thesis designs and implements in detail the parallel optimization of two single-join algorithms on the GPU, namely sort-merge join and hash join, and analyzes and compares the serial implementations of these two joins. Second, it discusses parallel scheduling strategies across joins, such as the sequential parallel execution strategy, the hierarchical parallel execution strategy, and the right-deep tree execution strategy, and analyzes and compares their strengths and weaknesses. Finally, experiments measure the performance of the sort-merge join and hash join algorithms on the GPU and on a multi-core CPU. The results show that the GPU-optimized sort-merge join and hash join outperform the parallel algorithms on the multi-core CPU, with speedups of 7.25 and 5.21, respectively. The experiments also compare the two algorithms on the GPU under different parallel scheduling strategies against a multi-join algorithm parallelized on the multi-core CPU; the results show that the GPU-optimized multi-join algorithm outperforms the multi-core CPU version. By using the GPU to improve the efficiency of multi-join query processing, the thesis achieves measurable gains and provides support for further improving the efficiency of database management.
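As a point of reference for the two single-join algorithms named in the abstract, the following is a minimal serial sketch in Python of a textbook hash join (build and probe phases) and a textbook sort-merge join over lists of `(key, payload)` tuples. It is not the thesis's GPU implementation: the function names, the tuple layout, and the choice to build on the smaller input are assumptions made for illustration. The comments mark the phases that a parallel version would typically distribute across threads.

```python
from collections import defaultdict


def hash_join(left, right):
    """Equi-join two lists of (key, payload) tuples; returns (key, left_payload, right_payload)."""
    # Assumption for this sketch: build the hash table on the smaller input.
    swapped = len(left) > len(right)
    build, probe = (right, left) if swapped else (left, right)

    # Build phase: group payloads of the build side by join key.
    table = defaultdict(list)
    for key, payload in build:
        table[key].append(payload)

    # Probe phase: every probe tuple is looked up independently,
    # which is the part a parallel version would spread across threads.
    out = []
    for key, payload in probe:
        for match in table.get(key, ()):
            out.append((key, payload, match) if swapped else (key, match, payload))
    return out


def sort_merge_join(left, right):
    """Equi-join two lists of (key, payload) tuples by sorting both sides and merging."""
    # Sort phase: a parallel version would use a parallel sort here.
    l = sorted(left, key=lambda t: t[0])
    r = sorted(right, key=lambda t: t[0])

    # Merge phase: advance two cursors and emit the cross product of equal-key runs.
    out = []
    i = j = 0
    while i < len(l) and j < len(r):
        if l[i][0] < r[j][0]:
            i += 1
        elif l[i][0] > r[j][0]:
            j += 1
        else:
            key = l[i][0]
            i_end = i
            while i_end < len(l) and l[i_end][0] == key:
                i_end += 1
            j_end = j
            while j_end < len(r) and r[j_end][0] == key:
                j_end += 1
            for _, lp in l[i:i_end]:
                for _, rp in r[j:j_end]:
                    out.append((key, lp, rp))
            i, j = i_end, j_end
    return out


if __name__ == "__main__":
    orders = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
    items = [(2, "x"), (3, "y"), (4, "z"), (2, "w"), (5, "v")]
    print(sorted(hash_join(orders, items)))
    print(sorted(sort_merge_join(orders, items)))
```

Both functions return the same set of joined rows, possibly in a different order, so comparing their sorted outputs is a simple way to check one implementation against the other.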
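[Degree-granting institution]: South China University of Technology
[Degree level]: Master's
[Year degree granted]: 2016
[Classification number]: TP338.6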



Link to this article: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/2233454.html

