Ailamaki, Anastasia; Raza, Aunn
2024-08-13; 2024-08-13; 2024-08-13; 2024-09-02
10.5075/epfl-thesis-9830
https://infoscience.epfl.ch/handle/20.500.14299/240719

Operational analytics enables real-time data analysis for actionable insights on operational data through hybrid transactional and analytical processing (HTAP). HTAP focuses on analytical data freshness, that is, query processing on fresh data generated by concurrent transaction processing. The efficiency of HTAP systems is driven by sharing fresh data from the transactional to the analytical part of the system while maintaining performance isolation between the individual workloads. However, executing mixed workloads (analytics and transactions) in a single integrated system has an inherent freshness-performance trade-off. Sharing computing and memory resources at the hardware level and sharing data structures for snapshot isolation cause destructive performance interference in HTAP. Mitigating performance interference through strict workload isolation trades either freshness for performance, by processing stale data, or performance for freshness, because of the latency of acquiring fresh data snapshots. HTAP systems typically optimize the freshness-performance trade-off at system design time, presuming expected workload and data freshness requirements. Nevertheless, optimal design decisions depend on volatile workload and freshness requirements, known only at runtime, thereby necessitating adaptivity. This thesis aims for runtime adaptivity in in-memory HTAP: adapting to workload and freshness requirements at runtime. The main design choices governing the freshness-performance trade-off in HTAP are 1) workload scheduling, which balances the trade-off between performance isolation and data locality for processing fresh data, and 2) snapshotting mechanisms, which define the fresh data access paths, thereby determining the trade-off between access path efficiency on the one hand and snapshot maintenance and storage costs on the other.
Regarding workload scheduling, we propose adaptive workload isolation. We encapsulate the HTAP design space in a continuous spectrum of scheduling states, ranging from colocated to isolated transactional and analytical processing. Through elastic resource scheduling, we then adapt workload isolation to the amount and pattern of fresh data accesses. Regarding snapshotting, we introduce column-level snapshot-on-read, which provides concurrent snapshots with sequential access patterns for analytics without upfront storage overheads. Additionally, we alleviate the storage housekeeping costs of tuple-level, snapshot-on-write versioning, such as the delta store in multi-version concurrency control (MVCC), by introducing temporality-aware version storage, which eliminates fine-grained version maintenance and garbage collection overheads. Overall, we design adaptive HTAP that enables efficient operational analytics on fresh data. Instead of presuming workload access patterns and data freshness requirements at design time, this thesis embraces runtime adaptivity as a core design principle for scaling mixed workloads in data management systems. Specifically, adaptive HTAP elastically schedules workloads and adapts fresh data access paths at runtime. As a result, our design provides flexibility and coverage for unpredictable workload and freshness requirements.

en
Database management systems (DBMS); Hybrid transactional and analytical processing (HTAP); Online analytical query processing (OLAP); Online transaction processing (OLTP); real-time analytics; operational analytics; data analytics; workload scheduling; data snapshots; extract-transform-load (ETL)
Efficient Operational Analytics on Fresh Data
thesis::doctoral thesis