Journal article

Scalable and Dynamically Balanced Shared-Everything OLTP with Physiological Partitioning

Scaling the performance of shared-everything transaction processing systems to highly parallel multicore hardware remains a challenge for database system designers. Recent proposals alleviate locking and logging bottlenecks in the system, leaving page latching as the next potential problem. To tackle the page latching problem, we propose physiological partitioning (PLP). The PLP design applies logical-only partitioning, maintaining the desired properties of shared-everything designs, and introduces a multi-rooted B+Tree index structure (MRBTree) that enables the partitioning of accesses at the physical page level. Logical partitioning and MRBTrees together ensure that all accesses to a given index page come from a single thread and, hence, can be entirely latch-free; an extended design makes heap page accesses thread-private as well. Eliminating page latching allows us to simplify key code paths in the system, such as B+Tree operations, leading to more efficient and maintainable code. Profiling a prototype PLP system running on different multicore machines shows that it acquires 85% and 68% fewer contentious critical sections, respectively, than an optimized conventional design and one based on logical-only partitioning. PLP also improves performance by up to 40% and 18%, respectively, over the existing systems. Although partitioning is an increasingly popular solution for scaling up the performance of database management systems even within a single (multicore or multisocket) machine, it is not a panacea, since many challenges are associated with it. Therefore, in this paper, we also focus on one of the most troublesome challenges for partitioning-based transaction processing systems: their behavior under skewed and dynamically changing workloads.
We present experimental results that show the non-optimal performance of a PLP transaction processing system under such workloads and discuss the challenges in building robust and efficient dynamic load balancing mechanisms for such systems. Then, we propose a dynamic load balancing mechanism and integrate it with our PLP system. Evaluation results show that the overhead of the mechanism is low in normal operation (at most 8% in the worst case) and that it makes the system's behavior robust, while achieving very low response times in both detecting and handling load imbalances.
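
The idea of routing each key range to a single owning thread and its own sub-tree root can be illustrated with a small sketch. This is not the paper's implementation: the class `PartitionTable`, its methods `partition_of`, `insert`, and `split`, and the use of a plain dictionary in place of each B+Tree sub-tree are all assumptions made for illustration. The `split` method hints at how a load balancer might repartition a hot range by moving a boundary, under the same assumptions.

```python
from bisect import bisect_right


class PartitionTable:
    """Hypothetical routing table: key range -> (owner thread, private sub-tree)."""

    def __init__(self, boundaries, owners):
        # boundaries[i] is the smallest key of partition i + 1 (sorted ascending).
        if len(owners) != len(boundaries) + 1:
            raise ValueError("need exactly one owner per partition")
        self.boundaries = list(boundaries)
        self.owners = list(owners)
        # A plain dict stands in for each sub-tree root (assumption).
        self.subtrees = [dict() for _ in owners]

    def partition_of(self, key):
        # Binary search over the boundaries picks the partition index.
        return bisect_right(self.boundaries, key)

    def insert(self, key, value):
        # Only the owning thread would touch this sub-tree, so no latch is modeled.
        p = self.partition_of(key)
        self.subtrees[p][key] = value
        return self.owners[p]

    def split(self, p, split_key, new_owner):
        # Move keys >= split_key of partition p into a new partition p + 1.
        old = self.subtrees[p]
        moved = {k: v for k, v in old.items() if k >= split_key}
        for k in moved:
            del old[k]
        self.boundaries.insert(p, split_key)
        self.owners.insert(p + 1, new_owner)
        self.subtrees.insert(p + 1, moved)


table = PartitionTable(boundaries=[100], owners=["t0", "t1"])
table.insert(5, "a")      # routed to the partition owned by "t0"
table.insert(60, "b")     # also "t0" until the range is split
table.split(0, 50, "t2")  # rebalance: keys >= 50 of partition 0 go to "t2"
```

After the split, key 60 resolves to the new partition owned by `"t2"`, while keys at or above 100 still resolve to `"t1"`. A real system would also have to coordinate the split with in-flight transactions, which this sketch omits.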
