Abstract

To reduce energy consumption, embedded systems can be operated at sub-nominal conditions (e.g., reduced voltage or a lowered eDRAM refresh rate) that can introduce bit errors in their memories. These errors can corrupt the stored values of CNN weights and activations, compromising accuracy. In this paper, we introduce Embedded Ensemble CNNs (E2CNNs), an architectural design methodology for conceiving ensembles of convolutional neural networks that are more robust against memory errors than a single-instance network. Ensembles of CNNs have previously been proposed to increase accuracy, at the cost of replicating similar or different architectures. Unfortunately, state-of-the-art ensembles are ill-suited to embedded systems, where memory and processing constraints limit the number of deployable models. Our proposed architecture overcomes this limitation by applying state-of-the-art compression methods to produce an ensemble with the same memory requirements as the original architecture, but with improved error robustness. Then, as part of the E2CNNs design methodology, we propose a heuristic method that automates the design of the voter-based ensemble architecture, maximizing accuracy for the expected memory error rate while bounding the design effort. To evaluate the robustness of E2CNNs for different error types and densities, and their ability to achieve energy savings, we propose three error models that simulate the behavior of SRAM and eDRAM operating at sub-nominal conditions. Our results show that E2CNNs achieve energy savings of up to 80% for LeNet-5, 90% for AlexNet, 60% for GoogLeNet, 60% for MobileNet, and 60% for an optimized industrial CNN, while minimizing the impact on accuracy. Furthermore, the memory footprint can be reduced by up to 54% by decreasing the number of ensemble members, with a smaller impact on accuracy than obtained through pruning alone.
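The intuition behind the voter-based ensemble can be sketched with a minimal example: each compressed member classifies the input independently, and a majority vote selects the final class. The function name and the use of plain class labels are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of voter-based ensemble inference (illustrative,
# not the paper's implementation).
from collections import Counter

def majority_vote(member_predictions):
    """Return the class label chosen by the most ensemble members.

    member_predictions: list of predicted class labels, one per member.
    A memory error that corrupts one member's weights changes at most
    that member's vote, so the ensemble output is unaffected as long
    as a majority of members still agree.
    """
    counts = Counter(member_predictions)
    return counts.most_common(1)[0][0]

# Three compressed members classify the same input; one member,
# corrupted by bit errors, mispredicts, but the vote absorbs it.
print(majority_vote([3, 3, 7]))  # -> 3
```

This illustrates why replicating several smaller, compressed members can be more error-robust than one full-size network of the same total memory footprint.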
