Files

Abstract

Chip-multiprocessors require a coherence directory to track data sharing and order accesses to the shared data. Scaling coherence directories to support a large number of cores is challenging due to excessive area requirements of the directories. The state-of-the-art proposals reduce the directory size by not keeping coherence information for private data. These approaches are useful for workloads that have predominantly private data, but are not applicable to workloads with shared data. We observe that data are not actively shared by multiple cores. In workloads with a shared dataset, although each core accesses the whole data, the chance that multiple cores access the same piece of data at the same time is low. Based on this observation we design a Spatiotemporal Coherence Tracking scheme that drastically reduces the directory size without sacrificing performance. The proposed directory scheme uses dual-grain tracking and switches between the granularities whenever possible to save the area. It dynamically detects spatial regions of data that are privately accessed by one core over a time period and for those regions, increases coherence tracking granularity from block-level to region-level. Our experimental results show that the proposed approach can reduce the baseline sparse directory size by at least 75% across a variety of commercial and scientific workloads, while sacrificing only 1% of performance. Using our approach, the directory can be under-provisioned to have fewer entries than the number of cache blocks that are being tracked.

Details

Actions