Abstract

The increasing size of Convolutional Neural Networks (CNNs) and the high computational workload required for inference pose major challenges for their deployment on resource-constrained edge devices. In this paper, we address these challenges by proposing a novel In-Memory Computing (IMC) architecture. Our IMC strategy efficiently performs arithmetic operations based on bitline computing, enabling a high degree of parallelism while reducing energy-costly data transfers. Moreover, it features a hybrid memory structure, in which the portion of each subarray dedicated to storing CNN weights is implemented as high-density, zero-standby-power Resistive RAM. Finally, it exploits an innovative method for storing quantized weights based on their value, named Weight Data Mapping (WDM), which further increases efficiency. Compared to state-of-the-art IMC alternatives, our solution improves energy efficiency by up to 93% and reduces run-time by up to 6x when performing inference on the MobileNet and AlexNet neural networks.
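For readers unfamiliar with bitline computing, the following minimal Python sketch (our illustration, not the paper's implementation) shows the underlying idea: activating two memory rows at once lets the bitlines sense bitwise combinations (AND/NOR) of the stored words in a single access, and such primitives can then be composed into the additions needed for CNN arithmetic. The word width, row values, and the ripple-carry composition are all assumptions made for illustration.

```python
# Minimal sketch of bitline computing, simulated in software.
# Assumptions (not from the paper): 8-bit words, AND/NOR as the
# sensed primitives, ripple-carry composition for addition.

WORD_BITS = 8
MASK = (1 << WORD_BITS) - 1


def bitline_and(row_a: int, row_b: int) -> int:
    """Bitwise AND, as sensed on the bitlines when both rows are activated."""
    return row_a & row_b


def bitline_nor(row_a: int, row_b: int) -> int:
    """Bitwise NOR, as sensed on the complementary bitlines."""
    return ~(row_a | row_b) & MASK


def in_memory_add(row_a: int, row_b: int) -> int:
    """Compose the bitwise primitives into addition via ripple carry:
    XOR gives the partial sum, AND shifted left gives the carry."""
    while row_b:
        carry = bitline_and(row_a, row_b)
        row_a = (row_a ^ row_b) & MASK  # sum without carry
        row_b = (carry << 1) & MASK     # propagate carry
    return row_a


if __name__ == "__main__":
    a, b = 0b00101101, 0b01100110
    print(f"AND: {bitline_and(a, b):08b}")
    print(f"NOR: {bitline_nor(a, b):08b}")
    print(f"ADD: {in_memory_add(a, b)} == {a + b}")
```

Because every column of the subarray computes its bit position simultaneously, a single row-pair activation operates on all stored words in parallel, which is the source of the parallelism and the reduction in data movement claimed above.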
