Abstract

Database workloads have evolved significantly over the past twenty years. Traditional database systems, mainly built to serve Online Transactional Processing (OLTP) workloads, have evolved into specialized systems optimized for particular types of workloads: data warehousing applications gave rise to Online Analytical Processing (OLAP) workloads, and real-time analytical processing applications gave rise to Hybrid Transactional and Analytical Processing (HTAP) workloads. Modern hardware has evolved just as dramatically over the same period. Simple, single-core processors with megabytes of main memory have given way to power-limited, multi-core processors with hundreds of gigabytes of main memory and complex micro-architectural features such as Single Instruction Multiple Data (SIMD) instructions and sophisticated branch predictors. These advances in processor technology have, in turn, driven further evolution of database systems, producing novel system architectures and query processing paradigms. We present the micro-architectural behavior of modern database workloads on a modern processor across several generations of database systems, examining three main categories of workloads separately: OLTP, OLAP, and HTAP.

We show that OLTP systems spend most of their execution time stalled on instruction-cache and data-cache misses, where the data-cache misses stem from the random data accesses performed during index lookups. While an efficient index structure can significantly reduce the number of data-cache misses, the dominant micro-architectural bottleneck remains the data-cache misses caused by costly random data accesses. Hence, OLTP systems should adopt techniques that mitigate random data accesses.

OLAP systems likewise spend most of their execution time on data-cache misses: for sequential-scan-heavy queries the misses arise from high pressure on memory bandwidth, whereas for join-intensive queries they arise from random data accesses. OLAP systems that follow a tuple-at-a-time execution model use CPU cycles efficiently, but they execute a significantly larger number of instructions and are therefore significantly slower than systems that follow vector-at-a-time or compiled execution models. OLAP systems should therefore use efficient execution models and adopt techniques that mitigate data-cache misses.

HTAP systems combine OLTP and OLAP systems into a single, unified system, where both sides run on the same hardware and on the same data. Running on the same hardware results in hardware-level interference: OLTP throughput drops significantly because the OLAP side shares hardware resources. Running on the same data increases OLAP query execution time, since the OLAP side must process the fresh tuples generated by the OLTP side. HTAP systems should therefore adopt techniques that mitigate hardware-level interference and ensure that the OLAP side uses enough resources to minimize the increase in query execution time.
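To make the access-pattern findings concrete, the following minimal C++ sketch (ours, not code from any system studied here) contrasts a sequential scan with a dependent pointer-chasing loop that stands in for an index lookup. The array size, the cycle-based traversal, and the timing harness are illustrative assumptions chosen so the working set exceeds on-chip caches.

    // Illustrative sketch: bandwidth-bound sequential scan (OLAP-style)
    // versus latency-bound dependent random accesses (OLTP index-style).
    #include <algorithm>
    #include <chrono>
    #include <cstddef>
    #include <cstdint>
    #include <cstdio>
    #include <numeric>
    #include <random>
    #include <vector>

    int main() {
        constexpr std::size_t kCount = 1u << 24;  // ~16M entries, well beyond the LLC

        // Sequential scan: the hardware prefetcher hides most miss latency,
        // so throughput is limited by memory bandwidth.
        std::vector<std::uint64_t> column(kCount, 1);
        auto t0 = std::chrono::steady_clock::now();
        std::uint64_t sum =
            std::accumulate(column.begin(), column.end(), std::uint64_t{0});
        auto t1 = std::chrono::steady_clock::now();

        // Random traversal: link all entries into one random cycle so every
        // load depends on the previous one, mimicking pointer chasing in an
        // index; almost every access misses the caches.
        std::vector<std::uint32_t> order(kCount), next(kCount);
        std::iota(order.begin(), order.end(), 0u);
        std::shuffle(order.begin(), order.end(), std::mt19937{42});
        for (std::size_t i = 0; i < kCount; ++i)
            next[order[i]] = order[(i + 1) % kCount];
        std::uint32_t pos = 0;
        auto t2 = std::chrono::steady_clock::now();
        for (std::size_t i = 0; i < kCount; ++i) pos = next[pos];  // one miss per step
        auto t3 = std::chrono::steady_clock::now();

        using ms = std::chrono::milliseconds;
        std::printf("sequential: %lld ms, random: %lld ms (sum=%llu, pos=%u)\n",
                    (long long)std::chrono::duration_cast<ms>(t1 - t0).count(),
                    (long long)std::chrono::duration_cast<ms>(t3 - t2).count(),
                    (unsigned long long)sum, pos);
    }

On typical hardware the dependent traversal is dramatically slower even though both loops touch the same number of entries, which is the effect attributed above to index lookups in OLTP systems.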
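The execution-model comparison can be sketched similarly. Below, a hypothetical selection-plus-sum query is evaluated tuple-at-a-time (one iterator call per row, in the style of the Volcano model) and vector-at-a-time (one tight loop per batch). The operator shape, the batch size of 1024, and the predicate are illustrative assumptions, not any system's actual API.

    // Illustrative sketch: the same query under two execution models.
    #include <algorithm>
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // Tuple-at-a-time (Volcano-style): next() produces one row per call,
    // so per-call overhead is paid for every tuple.
    struct TupleScan {
        const std::vector<std::int64_t>& col;
        std::size_t pos = 0;
        bool next(std::int64_t& out) {
            if (pos == col.size()) return false;
            out = col[pos++];
            return true;
        }
    };

    std::int64_t sum_tuple_at_a_time(const std::vector<std::int64_t>& col) {
        TupleScan scan{col};
        std::int64_t v = 0, sum = 0;
        while (scan.next(v))        // one call and branch per tuple
            if (v > 50) sum += v;   // predicate evaluated row by row
        return sum;
    }

    // Vector-at-a-time: operators exchange batches, so interpretation
    // overhead is paid once per 1024 tuples and the inner loop stays tight
    // and SIMD-friendly.
    constexpr std::size_t kBatch = 1024;

    std::int64_t sum_vector_at_a_time(const std::vector<std::int64_t>& col) {
        std::int64_t sum = 0;
        for (std::size_t base = 0; base < col.size(); base += kBatch) {
            const std::size_t n = std::min(kBatch, col.size() - base);
            for (std::size_t i = 0; i < n; ++i)
                if (col[base + i] > 50) sum += col[base + i];
        }
        return sum;
    }

Compiled (data-centric) execution goes one step further, fusing the whole pipeline into a single tight loop at query compilation time and eliminating even the per-batch calls.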
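Finally, the fresh-tuple cost on the HTAP side can be pictured with a toy delta-store design. The HtapColumn type, its insert/sum interface, and the implied merge policy are hypothetical; real HTAP engines differ, but the extra scan work over OLTP-generated deltas is the common cost.

    // Illustrative sketch: an HTAP-style scan must union read-optimized base
    // data with a delta of fresh tuples appended by the OLTP side.
    #include <cstdint>
    #include <vector>

    struct HtapColumn {
        std::vector<std::int64_t> base;   // read-optimized, OLAP-friendly storage
        std::vector<std::int64_t> delta;  // fresh tuples from the OLTP side

        void insert(std::int64_t v) { delta.push_back(v); }  // OLTP write path

        // OLAP read path: scan the base data, then the fresh tuples. Periodically
        // merging the delta into the base (not shown) trades ingest cost for
        // scan speed.
        std::int64_t sum() const {
            std::int64_t s = 0;
            for (auto v : base)  s += v;  // sequential, bandwidth-bound
            for (auto v : delta) s += v;  // extra work proportional to freshness
            return s;
        }
    };
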
