LP Residual Features for Robust, Privacy-Sensitive Speaker Diarization

Parthasarathi, Sree Hari Krishnan; Bourlard, Hervé; Gatica-Perez, Daniel

doi:10.21437/Interspeech.2011-390

2011

Abstract

We present a comprehensive study of the linear prediction (LP) residual for speaker diarization under single and multiple distant microphone conditions in privacy-sensitive settings, a requirement for analyzing a wide range of spontaneous conversations. Two representations of the residual are compared, real cepstrum and MFCC, with the latter performing better. Experiments on RT06eval show that the residual, combined with subband information from 2.5 kHz to 3.5 kHz and spectral slope, yields performance close to that of traditional MFCC features. To evaluate privacy objectively in terms of linguistic information, we perform phoneme recognition: residual features yield low phoneme accuracies compared with traditional MFCC features.
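
For readers unfamiliar with the terms in the abstract, the following is a minimal sketch of how an LP residual and its real cepstrum can be computed for one frame of speech. It uses the standard autocorrelation method of linear prediction and is not taken from the paper: the function names, the LP order of 12, the number of cepstral coefficients, and the small diagonal loading term are all assumptions made for illustration.

```python
import numpy as np

def lp_coefficients(frame, order):
    """Estimate LP coefficients with the autocorrelation method (R a = r)."""
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz autocorrelation matrix built from lags 0..order-1.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    # Tiny diagonal loading for numerical stability (assumption of this sketch).
    R += 1e-9 * r[0] * np.eye(order)
    return np.linalg.solve(R, r[1:])

def lp_residual(frame, order=12):
    """Inverse-filter the frame: e[n] = s[n] - sum_k a[k] * s[n-k]."""
    a = lp_coefficients(frame, order)
    inverse_filter = np.concatenate(([1.0], -a))
    return np.convolve(frame, inverse_filter)[:len(frame)]

def real_cepstrum(x, n_coeffs=13):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    log_mag = np.log(np.abs(np.fft.rfft(x)) + 1e-12)
    return np.fft.irfft(log_mag)[:n_coeffs]
```

Inverse filtering removes most of the spectral envelope, which carries much of the phonetic content. This is consistent with the low phoneme accuracies the abstract reports for residual features.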

Details

Title: LP Residual Features for Robust, Privacy-Sensitive Speaker Diarization

Author(s): Parthasarathi, Sree Hari Krishnan; Bourlard, Hervé; Gatica-Perez, Daniel

Published in: Interspeech 2011

Pages: 1045-1048

Conference: Interspeech

Date: 2011

DOI: https://doi.org/10.21437/Interspeech.2011-390

Laboratories: LIDIAP

Record creation date: 2011-07-06
