Abstract

GPUs are becoming increasingly popular in large-scale data center installations due to their strong, embarrassingly parallel processing capabilities. Data management systems are riding this wave by using GPUs to accelerate query execution, mainly for analytical workloads. However, this acceleration comes at the price of a slow interconnect, which imposes strict bandwidth and latency restrictions when moving data from main memory to the GPU for processing. Prior research in data management systems relies mostly on late materialization and data sharing to mitigate the overheads that slow interconnects introduce, even in the standard CPU processing case. In addition, workload trends are moving beyond purely analytical processing toward fresh-data processing, typically referred to as Hybrid Transactional and Analytical Processing (HTAP). We therefore observe an evolution along three different axes: interconnect technology, GPU architecture, and workload characteristics. In this paper, we break the evolution of the technological landscape into steps and study the applicability and performance of late materialization and data sharing at each one. We demonstrate that the standard PCIe interconnect substantially limits the performance of state-of-the-art GPUs, and we propose a hybrid materialization approach that combines eager with lazy data transfers. Further, we show that the wide gap between GPU and PCIe throughput can be bridged through efficient data sharing techniques. Finally, we present an H2TAP system design that removes software-level interference, and we show that interference on the memory bus is minimal, allowing data transfer optimizations as in OLAP workloads.
