HetExchange: Encapsulating heterogeneous CPU-GPU parallelism in JIT compiled engines

Modern server hardware is increasingly heterogeneous as hardware accelerators, such as GPUs, are used together with multicore CPUs to meet the computational demands of modern data analytics workloads. Unfortunately, query parallelization techniques used by analytical database engines are designed for homogeneous multicore servers, where query plans are parallelized across CPUs to process data stored in cache coherent shared memory. Thus, these techniques are unable to fully exploit available heterogeneous hardware, where one needs to exploit task-parallelism of CPUs and data-parallelism of GPUs for processing data stored in a deep, non-cache-coherent memory hierarchy with widely varying access latencies and bandwidth. In this paper, we introduce HetExchange–a parallel query execution framework that encapsulates the heterogeneous parallelism of modern multi-CPU–multi-GPU servers and enables the parallelization of (pre-)existing sequential relational operators. In contrast to the interpreted nature of traditional Exchange, HetExchange is designed to be used in conjunction with JIT compiled engines in order to allow a tight integration with the proposed operators and generation of efficient code for heterogeneous hardware. We validate the applicability and efficiency of our design by building a prototype that can operate over both CPUs and GPUs, and enables its operators to be parallelism- and data-location-agnostic. In doing so, we show that efficiently exploiting CPU–GPU parallelism can provide 2.8x and 6.4x improvement in performance compared to state-of-the-art CPU-based and GPU-based DBMS.

Published in:
Proceedings of the VLDB Endowment
Presented at:
45th International Conference on Very Large Data Bases, Los Angeles, California, USA, August 26-30, 2019

 Record created 2018-12-18, last modified 2019-03-17

Download fulltext

Rate this document:

Rate this document:
(Not yet reviewed)