Over the past 40 years, hard disks, the traditional building block of storage systems, have fallen increasingly behind other system components such as main memory and the CPU. Hard disks face mechanical constraints that cause their I/O bandwidth to lag behind capacity growth, while their access latency has remained virtually unchanged for the past 20 years. The growing gap between main memory and persistent storage has brought us to a point where I/O accesses are the main bottleneck limiting the performance of database management systems.

Several new solid-state storage technologies have been under development and are now commercially successful, with NAND flash memory being the most mature. Such technologies store data durably, have no mechanical constraints limiting their I/O performance, and can bridge the growing gap between main memory and persistent storage. Solid-state drives, however, have very different characteristics from hard disks, making them challenging to integrate: for decades, database management systems have been implicitly developed and optimized around the model of a slow rotating hard disk.

In this thesis, we focus on transactional database workloads composed of large numbers of small concurrent queries. Transactional workloads generate large numbers of scattered I/O requests at the level of the storage system, which are traditionally challenging for hard disks to service. Solid-state drives, on the other hand, are much better suited, as they natively service I/O requests in parallel and their performance is not constrained by access locality.

The approach we take in this thesis is to traverse the I/O stack of a database management system, i.e., the major software layers that define how data is stored and accessed, and show how these components can, with careful consideration of flash behavior and through novel algorithms, make efficient use of flash-based storage devices. The methodology we use is to observe the data access patterns and the data access frequencies at each level of the I/O stack and to design new algorithms and data structures that match the characteristics of the workload with the constraints of flash memory. We use a bottom-up approach to revisit the Storage Device, the Storage Manager, and the Caching layers of a data management system and show how each component can better exploit flash technology.

At the level of the Storage Device, we show how a solid-state drive can minimize the overhead of storing data on the raw flash chips by exploiting the write skew present in real-life transactional workloads. We then introduce a log-structured data placement policy for the Storage Manager of a database system that transparently adapts the data access pattern to eliminate the random writes that are problematic for flash solid-state drives. Finally, we propose an architecture for the Caching layer that allows database systems to efficiently move data between main memory and a fast solid-state drive without incurring the large overhead and limitations of the traditional buffer pool design.
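To make the log-structured data placement idea concrete, the sketch below shows, in simplified form, how in-place page updates can be turned into sequential appends so that the device only ever sees sequential writes. This is not the thesis implementation: the class name `LogStructuredStore`, the page size, and the in-memory page table are illustrative assumptions, and garbage collection of stale page versions is omitted.

```cpp
// Minimal sketch of a log-structured write path (illustrative only).
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>

constexpr size_t PAGE_SIZE = 4096;  // assumed page size

class LogStructuredStore {
public:
    explicit LogStructuredStore(const std::string& path)
        : log_(path, std::ios::binary | std::ios::app), tail_(0) {}

    // Instead of overwriting a page at its old location (a random write on
    // the SSD), append the new version at the log tail and record where it
    // now lives. All device-level writes are therefore sequential.
    void write_page(uint64_t page_id, const std::vector<char>& data) {
        std::vector<char> page(PAGE_SIZE, 0);
        std::memcpy(page.data(), data.data(), std::min(data.size(), PAGE_SIZE));
        log_.write(page.data(), PAGE_SIZE);
        page_table_[page_id] = tail_;   // the old version becomes garbage
        tail_ += PAGE_SIZE;
    }

    // Reads consult the page table to find the offset of the latest version.
    bool locate_page(uint64_t page_id, uint64_t& offset) const {
        auto it = page_table_.find(page_id);
        if (it == page_table_.end()) return false;
        offset = it->second;
        return true;
    }

private:
    std::ofstream log_;                                   // append-only log file
    uint64_t tail_;                                       // next append offset
    std::unordered_map<uint64_t, uint64_t> page_table_;   // page id -> log offset
};

int main() {
    LogStructuredStore store("pages.log");
    store.write_page(42, std::vector<char>(PAGE_SIZE, 'a'));
    store.write_page(42, std::vector<char>(PAGE_SIZE, 'b'));  // new version appended
    uint64_t off;
    if (store.locate_page(42, off))
        std::cout << "latest version of page 42 at offset " << off << "\n";
}
```

In a real system the indirection table must itself be made persistent and stale versions reclaimed, but the core point stands: scattered logical updates reach the flash device as a single sequential stream.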