Infoscience

Thesis

Binaural localization and separation techniques

Abstract

Based on binaural signals, i.e. the signals observed at the two ears, a listener can localize and recognize different sound sources and then focus on one of them. For decades, researchers have tried to build a machine that can do the same under similar conditions. Despite these efforts, the human auditory system remains far superior to any machine that has been devised. The topic of this thesis is computational techniques for the localization and separation of sources in binaural signals. To give an overview of the different areas of research that have considered the problems of source localization and separation, we start with a review of existing techniques. This provides the background for the techniques that we propose subsequently.

Binaural localization
The most important cues for the localization of sound sources in binaural signals are the level and time differences between the ears. We propose a technique for the joint evaluation of these cues, in which noisy level difference estimates are combined with less noisy but ambiguous time difference estimates in order to provide accurate azimuth estimates. The proposed technique enables the localization of sources and their tracking in dynamic scenes.

Head model
Based on a study of the level and time differences as a function of azimuth angle for different heads, we propose a generic model that is parametrized by the distance between the ears only. This enables the use of the binaural localization technique mentioned above for a listener whose head-related transfer functions have not been measured.

Binaural separation
For the separation of sources, we propose a method based on spatial windowing in the azimuth parameter space.

Separation of overlapping partials
Finally, we propose a technique for the separation of overlapping partials in mixtures of harmonic instruments. The technique is based on the similarity of temporal envelopes between the different partials of a harmonic note.
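The idea of combining a noisy level difference with a precise but ambiguous time difference, as described in the abstract, can be illustrated with a minimal sketch. This is not the method of the thesis, whose details are not given in this record. The sketch assumes a single narrowband source, a far-field sine-law model for the time difference (ITD = d/c · sin θ, with d the distance between the ears), and a purely hypothetical level difference model that is proportional to sin θ. All function names and default values (`ear_distance`, `max_ild_db`) are illustrative assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def itd_candidates(ipd, freq, max_itd):
    """Return all time differences (s) consistent with a phase difference
    that is only known modulo 2*pi, limited to |ITD| <= max_itd."""
    period = 1.0 / freq
    base = ipd / (2.0 * math.pi * freq)
    k_max = int(math.ceil(max_itd / period)) + 1
    candidates = []
    for k in range(-k_max, k_max + 1):
        itd = base + k * period
        if abs(itd) <= max_itd:
            candidates.append(itd)
    return candidates


def azimuth_from_itd(itd, ear_distance):
    """Invert the assumed sine-law model ITD = d/c * sin(azimuth)."""
    x = itd * SPEED_OF_SOUND / ear_distance
    return math.asin(max(-1.0, min(1.0, x)))


def azimuth_from_ild(ild_db, max_ild_db):
    """Invert the hypothetical model ILD = max_ild_db * sin(azimuth)."""
    x = ild_db / max_ild_db
    return math.asin(max(-1.0, min(1.0, x)))


def joint_azimuth(ipd, ild_db, freq, ear_distance=0.18, max_ild_db=20.0):
    """Pick the ITD-based azimuth candidate closest to the coarse,
    unambiguous ILD-based azimuth estimate (both in radians)."""
    max_itd = ear_distance / SPEED_OF_SOUND
    coarse = azimuth_from_ild(ild_db, max_ild_db)
    candidates = [
        azimuth_from_itd(itd, ear_distance)
        for itd in itd_candidates(ipd, freq, max_itd)
    ]
    if not candidates:
        return coarse  # fall back to the level-difference estimate alone
    return min(candidates, key=lambda az: abs(az - coarse))


if __name__ == "__main__":
    # Simulate a source at 60 degrees observed at 3 kHz.
    true_az = math.radians(60.0)
    freq = 3000.0
    true_itd = 0.18 / SPEED_OF_SOUND * math.sin(true_az)
    phase = 2.0 * math.pi * freq * true_itd
    wrapped_ipd = math.atan2(math.sin(phase), math.cos(phase))
    noisy_ild_db = 15.0  # the noiseless value under the assumed model is about 17.3 dB
    estimate = joint_azimuth(wrapped_ipd, noisy_ild_db, freq)
    print(round(math.degrees(estimate), 2))
```

In this example the wrapped phase difference is consistent with three azimuths (roughly -24, 13 and 60 degrees). The level difference alone points to about 49 degrees, which is inaccurate but sufficient to select the 60 degree candidate.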

    Thesis, École polytechnique fédérale de Lausanne (EPFL), no. 3043 (2004)
    Section des systèmes de communication
    Faculté informatique et communications
    Institut de systèmes de communication
    Laboratoire de communications audiovisuelles 1
    Jury: Andrzej Drygajlo, Aki Haermae, Stefan Launer, Emre Telatar

    Public defense: 2004-08-30


    Record created on 2005-03-16, modified on 2016-08-08
