Abstract

Energy consumption is a significant obstacle to integrating deep learning into edge devices. Two common techniques to curb it are quantization, which reduces the size of the memories (static energy) and the number of accesses (dynamic energy), and voltage scaling. However, SRAMs are prone to failures when operating at sub-nominal voltages, potentially introducing errors in computations. In this paper, we first analyze the resilience of AI-based methods for edge devices, in particular CNNs, to SRAM errors when operating at reduced voltages. Then, we compare the relative energy savings introduced by quantization and voltage scaling, both separately and together. Our experiments with an industrial use case confirm that CNNs are quite resilient to bit errors in the model, particularly for fixed-point implementations (5.7% accuracy loss with an error rate of 0.0065 errors per bit). Quantization alone can lead to savings of up to 61.3% in the dynamic energy consumption of the memory subsystem, with an additional reduction of up to 11.0% introduced by voltage scaling, all at the price of a 13.6% loss in accuracy.
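
To make the notion of "errors per bit" concrete, the following is a minimal sketch of how bit errors could be injected into fixed-point model weights stored in memory. It is an illustration under stated assumptions, not the fault model used in the paper: it assumes 8-bit symmetric quantization and independent, uniformly distributed bit flips, and the names `quantize_int8` and `inject_bit_errors` are hypothetical helpers introduced here.

```python
import numpy as np


def quantize_int8(weights):
    """Symmetric 8-bit fixed-point quantization (assumed scheme, for illustration)."""
    max_abs = float(np.max(np.abs(weights)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale


def inject_bit_errors(weights_q, bit_error_rate, rng):
    """Flip every stored bit independently with probability bit_error_rate."""
    # Reinterpret the int8 words as raw bytes and expand them to single bits.
    bits = np.unpackbits(weights_q.view(np.uint8))
    # Draw one Bernoulli trial per bit; True marks a bit that gets flipped.
    flips = (rng.random(bits.shape) < bit_error_rate).astype(np.uint8)
    # XOR applies the flips; repack the bits into bytes and restore the shape.
    corrupted = np.packbits(bits ^ flips)
    return corrupted.view(np.int8).reshape(weights_q.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(64, 64)).astype(np.float32)  # stand-in for one layer
    w_q, scale = quantize_int8(w)
    # 0.0065 errors per bit is the rate quoted in the abstract.
    w_faulty = inject_bit_errors(w_q, 0.0065, rng)
    changed = int(np.count_nonzero(w_faulty != w_q))
    print(f"{changed} of {w_q.size} weights were corrupted")
```

Evaluating a network with `w_faulty * scale` in place of the original weights would then give one sample of the accuracy under that error rate. Real SRAM failures at reduced voltage need not be independent or uniform across cells, so this sketch only approximates the setting described above.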