On Lock-Free Work-stealing Iterators for Parallel Data Structures

With the rise of multicores, there is a trend of supporting data-parallel collection operations in general purpose programming languages. These operations are highly parametric, incurring abstraction performance penalties. Furthermore, data-parallel operations must scale when applied to irregular workloads. Work-stealing is a proven technique for load balancing irregular workloads, but general purpose work-stealing also suffers abstraction penalties. We present a generic data-parallel collections design based on work-stealing for shared-memory architectures that overcomes abstraction penalties through callsite specialization of data-parallel operation instances. Moreover, we introduce \textit{work-stealing iterators} that allow fine-grained and efficient work-stealing for particular data-structures. By eliminating abstraction penalties and making work-stealing data-structure-aware we achieve up to 60x better performance compared to JVM-based approaches and 3x speedups compared to tools such as Intel TBB.


Year:
2014
Keywords:
Laboratories:




 Record created 2014-02-13, last modified 2018-01-28

External link:
Download fulltext
n/a
Rate this document:

Rate this document:
1
2
3
 
(Not yet reviewed)