System fusion and speaker linking for longitudinal diarization of TV shows

Ferras, Marc; Madikeri, Srikanth; Motlicek, Petr; Bourlard, Hervé

doi:10.1109/ICASSP.2016.7472728

2016

Abstract

Performing speaker diarization while uniquely identifying the speakers in a collection of audio recordings is a challenging task. Based on our previous work on speaker diarization and linking, we developed a system for diarizing longitudinal TV show data sets based on the fusion of speaker diarization system outputs and speaker linking. Agreement between multiple diarization outputs is found prior to speaker linking, largely reducing the diarization error rate at the expense of keeping some speech data unlabelled. To deal with noisy clusters, a linear prediction based technique was used to label speakers after linking. Considerable gains for both fusion and labelling are reported. Despite the challenges of the longitudinal diarization task, this system obtained similar performance for linked and non-linked tasks under moderate session variability, highlighting the viability of a linking approach to longitudinal diarization of speech in the presence of noise, music and special audio effects.

Details

Title System fusion and speaker linking for longitudinal diarization of TV shows

Author(s) Ferras, Marc ; Madikeri, Srikanth ; Motlicek, Petr ; Bourlard, Hervé

Published in 2016 IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings

Pagination 5

Pages 5495-5499

Conference Proceedings of 2016 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2016), Shanghai

Date 2016

Publisher New York, IEEE

ISSN 1520-6149

ISBN 978-1-4799-9988-0

Keywords

speaker diarization; linking; longitudinal; fusion; clustering; i-vector; ward

DOI https://doi.org/10.1109/ICASSP.2016.7472728

Laboratories LIDIAP

Record Appears in Scientific production and competences > STI - School of Engineering > IEM - Institut d'Electricité et de Microtechnique > LIDIAP - L'IDIAP Laboratory
Scientific production and competences > Euler Center for Signal Processing
Peer-reviewed publications
Conference Papers
Work produced at EPFL
Published

Record creation date 2016-05-19
