Implementing a Cocktail-party Processor via Time-frequency Masking
The ability of the human auditory system to focus on one signal and ignore others in an auditory scene where several auditory events take place, often referred to as the cocktail-party effect, is a key to the localization of sound sources. This ability is made possible in part by interaural cues, namely Interaural Time Differences (ITDs) and Interaural Level Differences (ILDs), between the ear input signals. These cues assist the estimation of source azimuth angles and the separation of the signal arriving from the desired direction from signals arriving from undesired directions. In this paper, we investigate simplified techniques for sound source separation based on inter-channel cues. Particular emphasis is put on the selection of time-frequency masks and its effect on the quality of source separation.
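As an illustration of the general idea of time-frequency masking from inter-channel level differences, the following is a minimal sketch and not the implementation described in the report. It assumes a two-channel mixture of two amplitude-panned sources, a binary mask with a 0 dB level-difference threshold, and a plain Hann-windowed STFT with 50% overlap. All function names, parameters, and test signals (`stft`, `istft`, `ild_binary_mask`, `threshold_db`, the 500 Hz and 2000 Hz tones) are hypothetical choices made for this example.

```python
import numpy as np

def stft(x, frame_len=512, hop=256):
    """Short-time Fourier transform, returned as (frames x frequency bins)."""
    # Periodic Hann window: at 50% overlap the shifted windows sum to one.
    window = np.hanning(frame_len + 1)[:-1]
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, frame_len=512, hop=256):
    """Inverse STFT by overlap-add (no synthesis window)."""
    frames = np.fft.irfft(spec, n=frame_len, axis=1)
    out = np.zeros(hop * (len(frames) - 1) + frame_len)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + frame_len] += frame
    return out

def ild_binary_mask(left_spec, right_spec, threshold_db=0.0, eps=1e-12):
    """True where the left channel is louder than the right by threshold_db."""
    ild_db = 20.0 * np.log10((np.abs(left_spec) + eps) /
                             (np.abs(right_spec) + eps))
    return ild_db > threshold_db

# Assumed toy scene: the target is panned left, the interferer panned right.
fs = 16000
t = np.arange(fs) / fs
target = np.sin(2 * np.pi * 500 * t)
interferer = np.sin(2 * np.pi * 2000 * t)
left = 0.9 * target + 0.3 * interferer
right = 0.3 * target + 0.9 * interferer

L, R = stft(left), stft(right)
mask = ild_binary_mask(L, R)   # keep bins dominated by the left-panned source
estimate = istft(mask * L)     # masked left channel back in the time domain
```

In this toy scene the two sources occupy disjoint frequency bins, so a binary mask separates them almost perfectly. Real speech mixtures overlap in time and frequency, which is why the choice of mask (binary or soft, and its threshold) affects separation quality. A time-difference cue could be used in the same way by thresholding the inter-channel phase difference instead of the level difference.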
COM415_proj_report_Tao_Benedikt_2011_1212.pdf (Adobe PDF, 228.28 KB, open access, checksum 9bd22bc93cd7dbc87e00273dca052d0a)
COM415_proj_spaa_Tao_Benedikt_2011_1211.zip (ZIP, 1.69 MB, open access, checksum 218fd8049fad0a85a4c6d9ef02f0c6d5)