Journal article

Timing-Error-Tolerant Network-on-Chip Design Methodology

With technology scaling, the wire delay as a fraction of the total delay is increasing, and the communication architecture is becoming a major bottleneck for system performance in systems on chip (SoCs). A communication-centric design paradigm, networks on chip (NoCs), has been proposed recently to address the communication issues of SoCs. As the geometries of devices approach the physical limits of operation, NoCs will be susceptible to various noise sources such as crosstalk, coupling noise, process variations, etc. Designing systems under such uncertain conditions become a challenge, as it is harder to predict the timing behavior of the system. The use of conservative design methodologies that consider all possible delay variations due to the noise sources, targeting safe system operation under all conditions will result in poor system performance. An aggressive design approach that provides resilience against such timing errors is required for maximizing system performance. In this paper, we present T-error, which is a timing-error-tolerant aggressive design method to design the individual components of the NoC (such as switches, links, and network interfaces), so that the communication subsystem can be clocked at a much higher frequency than a traditional conservative design (up to 1.5$times$ increase in frequency). The NoC is designed to tolerate timing errors that arise from overclocking without substantially affecting the latency for communication. We also present a way to dynamically configure the NoC between the overclocked mode and the normal mode, where the frequency of operation is lower than or equal to the traditional design's frequency, so that the error recovery penalty is completely hidden under normal operation. Experiments on several benchmark applications show large performance improvement (up to 33% reduction in average packet latency) for the proposed system when compared to traditional systems.

Related material