Files

Abstract

Manycore chips are emerging as the architecture of choice to provide power-scalability and improve performance while riding the Moore’s law. On-chip interconnects are increasingly playing a pivotal role in power- and performance- scalability of such microarchitectures. As supply voltages begin to level off in future technologies, chip designs in general and interconnects in particular are resorting to specialization to provide power- and performance-scalability. In this paper, we make the observation that cache-coherent manycore chips exhibit a duality in on-chip network traffic. Request traffic typically consists of control packets requiring narrow low-power switches, while response traffic often carries cache block-sized payloads that require wider and higher-power switches. We present Cache-Coherence Network-on-Chip (CCNoC), a design to capitalize on this duality in traffic and provide a pair of asymmetric switches that optimize power and performance over conventional onchip interconnects. Cycle-accurate simulation results for a 4x4 chip multiprocessor with a shared last-level cache running commercial server workloads indicate 22% improvement in power over a torus and 38% improvement in power over a mesh with larger channel width, while providing similar performance.

Details

PDF