Abstracting Multi-Core Topologies with MCTOP

Portability and efficiency are usually antagonists in multi-core computing. To develop efficient code, one needs to take into account the topology of the target multi-cores (e.g., for locality). This clearly hampers code portability. In this paper, we show that you can have your cake and eat it too. We introduce MCTOP, an abstraction of multi-core topologies augmented with important low-level hardware information, such as memory bandwidths and communication latencies. We show how to automatically generate MCTOP using libmctop, our library that leverages the determinism of cache-coherence protocols to infer the topology of multi-cores using only latency measurements. MCTOP enables developers to accurately and portably define high-level performance optimization policies. We illustrate several such policies through four examples: (i-ii) thread placement in OpenMP and in a MapReduce library, (iii) a topology-aware mergesort algorithm, and (iv) automatic backoff schemes for locks. We demonstrate, with low effort, the portability of these optimizations on five processors from Intel, AMD, and Oracle.
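
As a rough illustration of how a topology might be inferred from latency measurements alone, the sketch below groups hardware contexts whose pairwise communication latency falls under a threshold. It is not libmctop's algorithm or API: the function name `group_by_latency`, the threshold, and the latency values are all hypothetical, and a real tool would first have to measure the latencies and choose the thresholds itself.

```python
from typing import Dict, List


def group_by_latency(lat: List[List[float]], threshold: float) -> List[List[int]]:
    """Group hardware contexts whose pairwise latency is below `threshold`.

    `lat[i][j]` is an assumed, already-measured latency between contexts i and j.
    Grouping is transitive (union-find), so a chain of close pairs forms one group.
    """
    n = len(lat)
    parent = list(range(n))

    def find(x: int) -> int:
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if lat[i][j] < threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    # Keep the smaller index as the representative.
                    parent[max(ri, rj)] = min(ri, rj)

    groups: Dict[int, List[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return [groups[root] for root in sorted(groups)]


# Hypothetical latencies (in cycles) for four contexts spread over two sockets.
LATENCIES = [
    [0, 40, 300, 300],
    [40, 0, 300, 300],
    [300, 300, 0, 40],
    [300, 300, 40, 0],
]

sockets = group_by_latency(LATENCIES, threshold=100)
print(sockets)
```

Applying the same grouping with successively larger thresholds would yield a hierarchy (for example cores, then sockets, then the whole machine), which is the kind of structure a topology abstraction can expose to placement policies.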

Published in:
Proceedings of the Twelfth European Conference on Computer Systems - EuroSys '17, 544-559
Presented at:
Twelfth European Conference on Computer Systems (EuroSys '17), Belgrade, Serbia, April 23-26, 2017
New York, NY, USA, ACM Press

 Record created 2017-04-20, last modified 2020-07-29

