Abstract

We study the sparsity of the spectro-temporal representation of speech in reverberant acoustic conditions. This study motivates the use of structured sparsity models for efficient speech recovery. We formulate underdetermined convolutive speech separation in the spectro-temporal domain as a sparse signal recovery problem and leverage model-based recovery algorithms to solve it. To resolve the ambiguity of real acoustics, we exploit the Image Model of the enclosure to estimate the room impulse response through a structured-sparsity-constrained optimization. Experiments conducted on real recordings demonstrate the effectiveness of the proposed approach for multi-party speech applications.