An Information Theoretic Approach to Speaker Diarization of Meeting Data

Vijayasenan, Deepu; Valente, Fabio; Bourlard, Hervé

doi:10.1109/TASL.2009.2015698

2009

Abstract

A speaker diarization system based on an information theoretic framework is described. The problem is formulated according to the Information Bottleneck (IB) principle. Unlike other approaches where the distance between speaker segments is arbitrarily introduced, the IB method seeks the partition that maximizes the mutual information between observations and variables relevant for the problem while minimizing the distortion between observations. This solves the problem of choosing the distance between speech segments, which becomes the Jensen-Shannon divergence as it arises from the IB objective function optimization. We discuss issues related to speaker diarization using this information theoretic framework, such as the criteria for inferring the number of speakers, the trade-off between quality and compression achieved by the diarization system, and the algorithms for optimizing the objective function. Furthermore, we benchmark the proposed system against a state-of-the-art system on the NIST RT06 (Rich Transcription) data set for speaker diarization of meetings. The IB-based system achieves a Diarization Error Rate of 23.2% compared to 23.6% for the baseline system. Because this approach is mainly based on nonparametric clustering, it runs significantly faster than the baseline HMM/GMM-based system, resulting in faster-than-real-time diarization.
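To make the role of the Jensen-Shannon divergence concrete, the following is a minimal Python sketch of how a merge cost between two clusters of speech segments could be computed. It assumes the standard agglomerative formulation of the IB method, in which the cost of merging clusters i and j is (p(i) + p(j)) times the weighted Jensen-Shannon divergence between their relevance-variable distributions p(y|i) and p(y|j). The function names, the toy distributions, and the choice of an agglomerative merge are illustrative assumptions and are not taken from the paper's implementation.

```python
import numpy as np


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats.

    Terms where p is zero contribute nothing to the sum.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def js_divergence(p, q, w_p=0.5, w_q=0.5):
    """Weighted Jensen-Shannon divergence between distributions p and q.

    The weights w_p and w_q are assumed to be positive and to sum to one.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mixture = w_p * p + w_q * q
    return w_p * kl_divergence(p, mixture) + w_q * kl_divergence(q, mixture)


def merge_cost(prior_i, prior_j, p_y_given_i, p_y_given_j):
    """Hypothetical helper: cost of merging clusters i and j.

    Assumes the standard agglomerative IB result, where the loss in
    mutual information equals the combined prior times the
    prior-weighted Jensen-Shannon divergence.
    """
    total = prior_i + prior_j
    return total * js_divergence(
        p_y_given_i, p_y_given_j, prior_i / total, prior_j / total
    )


if __name__ == "__main__":
    # Toy distributions over three relevance variables (illustrative only).
    cluster_a = [0.7, 0.2, 0.1]
    cluster_b = [0.6, 0.3, 0.1]
    cluster_c = [0.1, 0.1, 0.8]
    print("cost(a, b):", merge_cost(0.3, 0.3, cluster_a, cluster_b))
    print("cost(a, c):", merge_cost(0.3, 0.4, cluster_a, cluster_c))
```

Under these assumptions, a greedy clustering pass would repeatedly merge the pair with the smallest cost, so segments with similar distributions (such as a and b above) are merged before dissimilar ones.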

Details

Title An Information Theoretic Approach to Speaker Diarization of Meeting Data

Author(s) Vijayasenan, Deepu ; Valente, Fabio ; Bourlard, Hervé

Published in IEEE Transactions on Audio, Speech, and Language Processing

Volume 17

Issue 7

Pages 1382-1393

Date 2009

DOI https://doi.org/10.1109/TASL.2009.2015698

Laboratories LIDIAP

Record appears in Peer-reviewed publications; Work produced at EPFL; Journal Articles; Published

Record creation date 2010-02-11
