Scalable parallelization of skyline computation for multi-core processors

The skyline is an important query operator for multi-criteria decision making. It reduces a dataset to only those points that offer optimal trade-offs of dimensions. In general, it is very expensive to compute. Recently, multicore CPU algorithms have been proposed to accelerate the computation of the skyline. However, they do not sufficiently minimize dominance tests and so are not competitive with state-of-the-art sequential algorithms. In this paper, we introduce a novel multicore skyline algorithm, Hybrid, which processes points in blocks. It maintains a shared, global skyline among all threads, which is used to minimize dominance tests while maintaining high throughput. The algorithm uses an efficiently-updatable data structure over the shared, global skyline, based on point-based partitioning. Also, we release a large benchmark of optimized skyline algorithms, with which we demonstrate on challenging workloads a 100-fold speedup over state-of-the-art multicore algorithms and a 10-fold speedup with 16 cores over state-of-the-art sequential algorithms.

Published in:
2015 IEEE 31st International Conference on Data Engineering, 1083-1094
Presented at:
2015 IEEE 31st International Conference on Data Engineering (ICDE), Seoul, South Korea, 13-17 April 2015

 Record created 2015-08-20, last modified 2018-03-17

Download fulltext

Rate this document:

Rate this document:
(Not yet reviewed)