Infoscience
research article

High-performance deep spiking neural networks with 0.3 spikes per neuron

Stanojevic, Ana • Wozniak, Stanislaw Andrzej • Bellec, Guillaume • Cherubini, Giovanni • Pantazi, Angeliki • Gerstner, Wulfram
August 9, 2024
Nature Communications

Communication by rare, binary spikes is a key factor in the energy efficiency of biological brains. However, biologically inspired spiking neural networks are harder to train than artificial neural networks. This is puzzling given that theoretical results provide exact mapping algorithms from artificial to spiking neural networks with time-to-first-spike coding. In this paper we analyze, in theory and simulation, the learning dynamics of time-to-first-spike networks and identify a specific instance of the vanishing-or-exploding gradient problem. While two choices of spiking neural network mappings solve this problem at initialization, only the one with a constant slope of the neuron membrane potential at threshold guarantees the equivalence of the training trajectory between spiking and artificial neural networks with rectified linear units. For specific image classification architectures comprising feed-forward dense or convolutional layers, we demonstrate that deep spiking neural network models can be effectively trained from scratch on the MNIST and Fashion-MNIST datasets, or fine-tuned on large-scale datasets such as CIFAR10, CIFAR100, and PLACES365, to achieve exactly the same performance as artificial neural networks, surpassing previous spiking neural networks. Our approach accomplishes high-performance classification with less than 0.3 spikes per neuron, lending itself to energy-efficient implementation. We also show that fine-tuning spiking neural networks with our robust gradient descent algorithm enables their optimization for hardware implementations with low latency and resilience to noise and quantization.
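The central idea in the abstract (that a ReLU network's activations can be mapped exactly onto first-spike times, so that stronger activations fire earlier and inactive neurons never fire) can be illustrated with a toy sketch. This is not the paper's exact mapping algorithm; the coding window `T_MAX` and the linear encoding `t = t_max - a` are simplifying assumptions chosen for illustration only.

```python
import numpy as np

# Illustrative sketch (NOT the paper's exact mapping): time-to-first-spike
# (TTFS) coding represents a ReLU activation a >= 0 as a spike time
# t = t_max - a, so stronger activations fire earlier, and neurons with
# zero activation stay silent -- which is where the extreme spike
# sparsity (e.g. < 0.3 spikes per neuron) comes from.

T_MAX = 1.0  # hypothetical length of the coding window


def relu(x):
    return np.maximum(x, 0.0)


def ttfs_encode(activations, t_max=T_MAX):
    """Map ReLU activations to first-spike times; silent neurons get inf."""
    a = np.clip(activations, 0.0, t_max)
    return np.where(a > 0.0, t_max - a, np.inf)  # inf = no spike emitted


x = np.array([-0.5, 0.0, 0.2, 0.9])
a = relu(x)
t = ttfs_encode(a)
spiking = np.isfinite(t)
# Only 2 of 4 neurons spike, and the strongest input spikes first.
```

The monotone, invertible relation between activation and spike time is what makes an exact ANN-to-SNN mapping possible in principle; the paper's contribution concerns which such mappings also keep the gradient dynamics of the two networks equivalent during training.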

Type
research article
DOI
10.1038/s41467-024-51110-5
Author(s)
Stanojevic, Ana (IBM Research - Zurich)
Wozniak, Stanislaw Andrzej (IBM Research - Zurich)
Bellec, Guillaume (EPFL)
Cherubini, Giovanni (IBM Research - Zurich)
Pantazi, Angeliki
Gerstner, Wulfram (EPFL)
Date Issued
2024-08-09
Publisher
Springer Science and Business Media LLC
Published in
Nature Communications
Volume
15
Issue
1
Article Number
6793
URL
https://www.nature.com/articles/s41467-024-51110-5
Editorial or Peer reviewed
REVIEWED
Written at
EPFL
EPFL units
LCN1
Funder
Swiss National Science Foundation
Funding(s)
Sinergia
Grant Number
CRSII5 198612

Available on Infoscience
August 12, 2024
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/240693
Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved.