Spiking neural networks for sound localization: A new perspective on illuminating auditory spatial perception
Humans estimate sound source direction using information from the auditory neural system. Traditional methods perform sound localization with auditory cues such as interaural time differences (ITDs) and interaural level differences (ILDs), extracted from binaural signals or decoded from neuronal firing rates. In contrast, we propose a computational model that localizes sound sources directly from the firing rates of auditory neurons, eliminating the need for physical cue extraction and template matching. The model combines spiking neural networks (SNNs) and artificial neural networks (ANNs) to emulate auditory spatial perception. To obtain firing rates, the SNN uses auditory peripheral processing and physiological models of the cochlear nucleus and medial superior olive (MSO). The SNN computes a database of firing rates from sine tones across varying positions and frequencies, which is used to train the ANN. The ANN performs nonlinear regression to predict azimuth and elevation angles, accommodating both narrowband and broadband signals. Integrating dynamic cues resolves front–back confusion, in line with human auditory perception. We conducted a localization listening test with 10 normal-hearing participants, enabling refinement of the network parameters to closely mimic human behavior. In the future, hearing loss can be simulated by adjusting parameters related to inner hair cell dysfunction, providing a robust framework for studying real-world spatial hearing.
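To make the ITD-coding stage concrete, the following is a minimal, illustrative sketch (not the paper's model) of MSO-like coincidence detection: phase-locked spike trains from the two ears are compared across a bank of internal delays, and the delay whose coincidence count (a proxy for firing rate) peaks indicates the interaural time difference. The spike-generation rule, delay range, and step size are all assumptions made for this toy example.

```python
import math

def spike_train(freq_hz, delay_s, dur_s=0.1, dt=1e-5):
    """Binary spike train locked to the positive-going zero crossings of a
    sine tone -- a crude stand-in for auditory-nerve phase locking."""
    n = int(dur_s / dt)
    spikes = [0] * n
    prev = math.sin(2 * math.pi * freq_hz * (0.0 - delay_s))
    for i in range(1, n):
        cur = math.sin(2 * math.pi * freq_hz * (i * dt - delay_s))
        if prev <= 0 < cur:
            spikes[i] = 1
        prev = cur
    return spikes

def coincidence_count(left, right, delay_steps):
    """Count spikes that coincide after delaying the left train, mimicking
    a delay-line coincidence detector in the MSO."""
    n = len(left)
    hits = 0
    for i in range(n):
        j = i + delay_steps
        if left[i] and 0 <= j < n and right[j]:
            hits += 1
    return hits

dt = 1e-5
itd_s = 3e-4                                 # true ITD: 300 us (assumed)
left = spike_train(500.0, 0.0, dt=dt)        # 500-Hz sine tone
right = spike_train(500.0, itd_s, dt=dt)     # same tone, delayed at the right ear

# Scan candidate internal delays (0..600 us in 10-us steps); the delay with
# the highest coincidence count is the ITD estimate.
delays = range(0, 61)
counts = [coincidence_count(left, right, d) for d in delays]
best = max(delays, key=lambda d: counts[d])
print(f"estimated ITD: {best * dt * 1e6:.0f} us")  # -> estimated ITD: 300 us
```

In the full model described above, such firing-rate profiles (over positions and frequencies) form the training database, and the ANN regresses azimuth and elevation from them instead of reading off a single peak.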
École Polytechnique Fédérale de Lausanne
2025-04-01
Volume 157, Issue 4_Supplement, p. A115