Caching All Plans with Just One Optimizer Call

Modern database management systems (DBMS) answer a multitude of complex queries on increasingly larger datasets. Given the complexities of the queries and the numerous design features, manual design is no longer an option. Instead, automatically designing the database is vital to maximize its performance and to reduce the total cost of ownership. For this purpose, commercial DBMS feature automated physical designers suggesting an efficient DB design by using the optimizer as a cost model. Unfortunately, consulting the optimizer is timeconsuming, an effect which is typically counter-acted by drastically pruning the search space, thereby potentially missing the optimal solution. Recently techniques cache the optimizer’s output and evaluate some plans with the cached results, reducing the number of calls to the optimizer. Still, however, the cost of invoking the optimizer to fill the cache is nontrivial, undermining scalability when running workloads with thousands of queries. In this paper, we use the intermediate optimization results in a dynamic programming based optimizer to reduce the cache initialization overhead. We demonstrate the accuracy and efficiency of our techniques by implementing them on the PostgreSQL open source query optimizer. For a star-schema workload, our techniques build the cost model 5 to 10 times faster than the conventional approach, while preserving accuracy.

Published in:
Proceedings of 5th International Workshop on Self Managing Database Systems (SMDB 2010)
Presented at:
5th International Workshop on Self Managing Database Systems (SMDB 2010), Long Beach, CA, USA, 1-6 March 2010

 Record created 2010-09-30, last modified 2018-01-28

External link:
Download fulltext
Rate this document:

Rate this document:
(Not yet reviewed)