Abstract

State-of-the-art Artificial Intelligence (AI) algorithms, such as graph neural networks and recommendation systems, require floating-point computation of very large matrix multiplications over sparse data. Their execution in resource-constrained scenarios, like edge AI systems, requires a) careful optimization of computing patterns, leveraging sparsity as an opportunity to lower computational requirements, and b) dedicated hardware. In this paper, we introduce a novel near-memory floating-point computing architecture dedicated to the parallel processing of sparse matrix-vector multiplication (SpMV). This architecture can be integrated at the periphery of memory arrays to exploit the inherent parallelism of memory structures to speed up computation. In addition, it uses its proximity to memory to achieve high computational capability and very low latency. The illustrated implementation, operating at 1 GHz, sustains up to 370 MFLOPS (millions of floating-point operations per second) on SpMV workloads, while incurring a modest 17% area overhead when interfaced with a 4 KB SRAM array.
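For context, the sketch below shows the SpMV kernel (y = A * x) that such an architecture accelerates, written in plain C over a compressed sparse row (CSR) layout. CSR is assumed here purely for illustration; the abstract does not specify the storage format or datapath of the proposed hardware. The point is that only stored nonzeros contribute to each dot product, which is where sparsity lowers the floating-point workload.

#include <stdio.h>

/* Minimal CSR SpMV sketch: y = A * x.
 * Illustrative only; CSR is an assumption, not necessarily the
 * layout used by the near-memory architecture described above. */
static void spmv_csr(int n_rows,
                     const int *row_ptr,   /* n_rows + 1 entries */
                     const int *col_idx,   /* one entry per nonzero */
                     const float *values,  /* one entry per nonzero */
                     const float *x,
                     float *y)
{
    for (int i = 0; i < n_rows; i++) {
        float acc = 0.0f;
        /* Only the stored nonzeros of row i are visited,
         * so the FLOP count scales with the nonzero count. */
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; k++)
            acc += values[k] * x[col_idx[k]];
        y[i] = acc;
    }
}

int main(void)
{
    /* 3x3 example matrix with 4 nonzeros:
     * [ 2 0 1 ]
     * [ 0 3 0 ]
     * [ 0 0 4 ] */
    int   row_ptr[] = {0, 2, 3, 4};
    int   col_idx[] = {0, 2, 1, 2};
    float values[]  = {2.0f, 1.0f, 3.0f, 4.0f};
    float x[]       = {1.0f, 1.0f, 1.0f};
    float y[3];

    spmv_csr(3, row_ptr, col_idx, values, x, y);
    for (int i = 0; i < 3; i++)
        printf("y[%d] = %g\n", i, y[i]);  /* expected: 3, 3, 4 */
    return 0;
}

In a near-memory implementation, the inner multiply-accumulate loop is the part mapped onto parallel floating-point units at the periphery of the memory array rather than executed sequentially by a host processor.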
