Implementing a Cocktail-party Processor via Time-frequency Masking
The ability of the human auditory system to focus on one signal and ignore others in an auditory scene where several auditory events take place, often referred to as the cocktail-party effect, is key to the localization of sound sources. This ability is made possible in part by interaural cues, namely Interaural Time Differences (ITDs) and Interaural Level Differences (ILDs), between the ear input signals. These cues assist both the estimation of source azimuth angles and the separation of the signal arriving from the desired direction from signals arriving from undesired directions. In this paper, we investigate simplified techniques for separating sound sources based on inter-channel cues. Particular emphasis is put on the selection of time-frequency masks and its effect on the quality of source separation.
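As a rough illustration of the masking idea, the sketch below derives a binary mask and a soft mask from the inter-channel level difference of two spectrograms. It is a minimal sketch under stated assumptions, not the method evaluated in this paper: the function names, the 3 dB tolerance, the Gaussian shape of the soft mask, and the restriction to level differences (no time differences) are all chosen here for illustration only.

```python
import numpy as np

def ild_db(left, right, eps=1e-12):
    """Inter-channel level difference in dB for each time-frequency bin.

    `left` and `right` are complex spectrograms of equal shape
    (frequency x time). `eps` avoids division by zero in silent bins.
    """
    return 20.0 * np.log10((np.abs(left) + eps) / (np.abs(right) + eps))

def binary_mask(left, right, target_ild_db, tol_db=3.0):
    """Return 1.0 where a bin's ILD lies within tol_db of the target, else 0.0.

    The tolerance is an assumed value, not one taken from the paper.
    """
    distance = np.abs(ild_db(left, right) - target_ild_db)
    return (distance <= tol_db).astype(float)

def soft_mask(left, right, target_ild_db, width_db=3.0):
    """Gaussian weighting of the ILD distance (an assumed soft-mask shape)."""
    distance = ild_db(left, right) - target_ild_db
    return np.exp(-0.5 * (distance / width_db) ** 2)

# Toy 2x2 spectrograms: only the first bin has the left channel ~6 dB louder.
left = np.array([[2.0 + 0j, 1.0 + 0j], [1.0 + 0j, 0.5 + 0j]])
right = np.array([[1.0 + 0j, 1.0 + 0j], [2.0 + 0j, 0.5 + 0j]])

mask = binary_mask(left, right, target_ild_db=6.0)
separated_left = mask * left  # masked spectrogram of the left channel
```

In a full processor the masked spectrogram would be transformed back to the time domain with an inverse short-time Fourier transform. Comparing `binary_mask` with `soft_mask` on the same input is one simple way to hear how the choice of mask affects separation quality.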