KL Realignment for Speaker Diarization with Multiple Feature Streams

Vijayasenan, Deepu; Valente, Fabio; Bourlard, Hervé

doi:10.21437/Interspeech.2009-325

2009

Abstract

This paper investigates Kullback-Leibler (KL) divergence based realignment for speaker diarization. KL divergence based realignment operates directly on the speaker posterior distribution estimates and is compared with traditional realignment performed using an HMM/GMM system. We hypothesize that using posterior estimates to realign speaker boundaries is more robust than Gaussian mixture models when multiple feature streams with different statistical properties are used. Experiments are run on the NIST RT06 data. They reveal that with conventional MFCC features the two approaches yield the same performance, while the KL based system outperforms HMM/GMM realignment when multiple feature streams (MFCC and TDOA) are combined.
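The idea of realigning on posterior estimates can be illustrated with a small sketch. It is not the authors' implementation: the abstract gives no algorithmic detail, so everything below is an assumption made for illustration. In particular, the hypothetical `combine_streams` helper merges per-stream posteriors by a weighted average, each speaker is represented by the mean posterior of its current frames, and each frame is reassigned independently to the speaker with the smallest KL divergence. A real system would also constrain speaker turn duration (for example with Viterbi decoding), which is omitted here.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists.

    Values are floored at eps so that zeros do not cause log(0).
    """
    return sum(pi * math.log(max(pi, eps) / max(qi, eps)) for pi, qi in zip(p, q))


def combine_streams(streams, weights):
    """Weighted average of per-frame posteriors from several feature streams.

    streams: list of streams; each stream is a list of per-frame distributions.
    weights: one weight per stream, assumed to sum to 1 (illustrative choice).
    """
    combined = []
    for frame_rows in zip(*streams):  # the same frame seen by every stream
        combined.append(
            [sum(w * v for w, v in zip(weights, vals)) for vals in zip(*frame_rows)]
        )
    return combined


def kl_realign(frame_posteriors, labels, num_speakers):
    """Reassign every frame to the speaker whose mean posterior is closest in KL."""
    models = []
    for s in range(num_speakers):
        rows = [p for p, lab in zip(frame_posteriors, labels) if lab == s]
        if not rows:
            raise ValueError(f"speaker {s} has no frames")
        # Speaker model: component-wise mean of the frames currently assigned to s.
        models.append([sum(col) / len(rows) for col in zip(*rows)])
    return [
        min(range(num_speakers), key=lambda s: kl_divergence(p, models[s]))
        for p in frame_posteriors
    ]


# Toy data: five frames, two posterior components, last frame initially mislabeled.
posteriors = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8], [0.85, 0.15]]
initial_labels = [0, 0, 1, 1, 1]
new_labels = kl_realign(posteriors, initial_labels, num_speakers=2)
```

In this toy case the last frame moves from speaker 1 to speaker 0, because its posterior is much closer to the mean posterior of speaker 0. Since the comparison is made on distributions rather than on raw features, streams with very different statistics (such as MFCC and TDOA) enter only through their posteriors, which is the intuition behind the robustness hypothesis stated in the abstract.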

Details

Title KL Realignment for Speaker Diarization with Multiple Feature Streams

Author(s) Vijayasenan, Deepu ; Valente, Fabio ; Bourlard, Hervé

Published in Interspeech 2009

Pages 1059-1062

Presented at 10th Annual Conference of the International Speech Communication Association

Date 2009

DOI https://doi.org/10.21437/Interspeech.2009-325

Additional link Related documents

Laboratories LIDIAP

This document appears in Scientific production and competences > STI - Faculté des sciences et techniques de l'ingénieur > IEM - Institute of Electrical and Micro Engineering > LIDIAP - Laboratoire de l'IDIAP
Scientific production and competences > Euler Center for Signal Processing
Conference papers
Work produced at EPFL
Published

Record creation date 2010-02-11
