LP Residual Features for Robust, Privacy-Sensitive Speaker Diarization

Parthasarathi, Sree Hari Krishnan; Bourlard, Hervé; Gatica-Perez, Daniel

doi:10.21437/Interspeech.2011-390

conference paper

2011

Interspeech 2011

We present a comprehensive study of the linear prediction (LP) residual for speaker diarization under single and multiple distant microphone conditions in privacy-sensitive settings, a requirement for analyzing a wide range of spontaneous conversations. Two representations of the residual are compared, namely real cepstrum and MFCC, with the latter performing better. Experiments on RT06eval show that the residual, combined with subband information from 2.5 kHz to 3.5 kHz and spectral slope, yields performance close to that of traditional MFCC features. To objectively evaluate privacy in terms of linguistic information, we perform phoneme recognition: residual features yield lower phoneme accuracies than traditional MFCC features.
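For readers unfamiliar with the LP residual, the following is a minimal sketch of how such a residual can be obtained from a single frame of audio samples. It uses the textbook autocorrelation method with the Levinson-Durbin recursion and is not taken from the paper: the function names (`autocorrelation`, `lp_coefficients`, `lp_residual`, `energy`), the prediction order of 10, and the synthetic test frame are all assumptions made for illustration. The record does not state the frame length, windowing, or LP order used by the authors.

```python
import math
import random


def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one frame (zero outside the frame)."""
    n = len(frame)
    return [
        sum(frame[t] * frame[t - lag] for t in range(lag, n))
        for lag in range(max_lag + 1)
    ]


def lp_coefficients(frame, order):
    """Levinson-Durbin recursion.

    Returns a list a of length `order` where a[k] weights the sample at
    lag k + 1, so that x[t] is approximated by sum_k a[k] * x[t - 1 - k].
    """
    r = autocorrelation(frame, order)
    a = [0.0] * (order + 1)  # a[0] is unused; a[i] is the weight for lag i
    err = r[0]  # prediction error energy of the order-0 predictor
    for i in range(1, order + 1):
        if err <= 1e-12:  # silent or perfectly predicted frame: stop early
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]


def lp_residual(frame, order=10):
    """Residual e[t] = x[t] - prediction, with samples before the frame taken as zero."""
    a = lp_coefficients(frame, order)
    residual = []
    for t, sample in enumerate(frame):
        predicted = sum(a[k] * frame[t - 1 - k] for k in range(min(order, t)))
        residual.append(sample - predicted)
    return residual


def energy(signal):
    """Sum of squared samples."""
    return sum(s * s for s in signal)


if __name__ == "__main__":
    # Hypothetical test frame: a noisy sinusoid standing in for voiced speech.
    rng = random.Random(0)
    frame = [
        math.sin(2 * math.pi * 0.05 * t) + 0.01 * rng.gauss(0.0, 1.0)
        for t in range(240)
    ]
    res = lp_residual(frame, order=10)
    print("residual / signal energy:", energy(res) / energy(frame))
```

The LP filter models the spectral envelope of the frame, so the residual retains mainly excitation information. Features such as the real cepstrum or MFCCs would then be computed on the residual rather than on the original signal; that step is not shown here.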

Name: Parthasarathi_INTERSPEECH_2011.pdf

Access type: openaccess

Size: 63.98 KB

Format: Adobe PDF

Checksum (MD5): 12ae5a239037007555aee2fa7dd39f98