On the Performance of Delegation over Cache-Coherent Shared Memory

Delegation is a thread synchronization technique where access to shared data is performed through a dedicated server thread. When a client thread requires shared data access, it makes a request to a server and waits for a response. This paper studies delegation implementation over cache-coherent shared-memory, with the goal of optimizing it for high throughput. Whereas client-server communication naturally fits message-passing systems, efficient implementation over cache-coherent shared memory requires careful optimization. We demonstrate optimizations that significantly improve delegation performance on two modern x86 processors (the Intel Xeon Westmere and the AMD Opteron Magny-Cours), enabling us to come up with counter, stack and queue implementations that outperform the best known alternatives in a large number of cases. Our optimized delegation solution achieves 1.4x (resp. 2x) higher throughput compared to the most efficient state-of-the-art delegation solution on the Intel Xeon (resp. AMD Opteron).

Presented at:
16th International Conference on Distributed Computing and Networking (ICDCN), Goa, India, January 4-7, 2015

 Record created 2014-10-02, last modified 2020-01-23

Download fulltext

Rate this document:

Rate this document:
(Not yet reviewed)