TurboTag: Lookup Filtering to Reduce Coherence Directory Power

On-chip coherence directories in today’s multi-core systems are not energy efficient. When running commercial server and scientific workloads, they dissipate a significant fraction of their power on unnecessary lookups. These workloads have large working sets that exceed the capacity of the on-chip caches of modern processors. Because private caches capture only a small part of the working set, they retain cache blocks only briefly before replacing them with new blocks. Moreover, coherence enforcement is a known performance bottleneck in multi-threaded software, so data sharing in optimized high-performance software is minimal. Consequently, the majority of accesses to the coherence directory find no sharers, because the data are not present in the on-chip private caches, and the power spent on those coherence checks is wasted. To improve energy efficiency in future many-core systems, we propose TurboTag, a filtering mechanism that eliminates needless directory lookups. We analyze full-system traces of server and scientific workloads and find that over 69% of accesses to the directory find no sharers and can be avoided entirely. Taking advantage of this behavior, TurboTag achieves a 58% reduction in the directory’s dynamic power consumption.
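
The general idea of filtering directory lookups can be illustrated with a toy model. The abstract does not describe TurboTag's internal structure, so the sketch below is not the authors' design: it assumes a hypothetical array of per-bucket counters placed in front of a directory, and every name in it (FilteredDirectory, num_buckets, add_sharer, remove_sharer, lookup) is invented for illustration. A lookup whose bucket counter is zero is answered without touching the directory; a nonzero counter may be a false positive, in which case the directory is still consulted.

```python
class FilteredDirectory:
    """Toy coherence directory with a conservative lookup filter (illustrative only)."""

    def __init__(self, num_buckets=1024):
        self.num_buckets = num_buckets
        # Hypothetical filter: one counter per bucket, counting cached copies
        # of all block addresses that map to that bucket.
        self.counters = [0] * num_buckets
        self.sharers = {}            # block address -> set of core ids
        self.directory_lookups = 0   # lookups that reached the directory
        self.filtered_lookups = 0    # lookups answered by the filter alone

    def _bucket(self, addr):
        return addr % self.num_buckets

    def add_sharer(self, addr, core):
        cores = self.sharers.setdefault(addr, set())
        if core not in cores:
            cores.add(core)
            self.counters[self._bucket(addr)] += 1

    def remove_sharer(self, addr, core):
        cores = self.sharers.get(addr)
        if cores and core in cores:
            cores.remove(core)
            self.counters[self._bucket(addr)] -= 1
            if not cores:
                del self.sharers[addr]

    def lookup(self, addr):
        # A zero counter proves no private cache holds the block,
        # so the directory access can be skipped.
        if self.counters[self._bucket(addr)] == 0:
            self.filtered_lookups += 1
            return frozenset()
        # Nonzero counter: the block may or may not be shared; ask the directory.
        self.directory_lookups += 1
        return frozenset(self.sharers.get(addr, ()))
```

In this model the filter never reports "no sharers" for a block that is actually cached, so skipping the directory is always safe; the cost of a small filter is only that some lookups with no sharers still reach the directory.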

Published in:
Proceedings of the 16th International Symposium on Low Power Electronics and Design (ISLPED 10), 377-382
Presented at:
16th International Symposium on Low Power Electronics and Design (ISLPED 10), Austin, Texas, USA, August 18-20
New York, NY, USA, ACM

Record created 2010-06-29, modified 2019-08-12
