Infoscience

conference paper

System fusion and speaker linking for longitudinal diarization of TV shows

Ferras, Marc
•
Madikeri, Srikanth
•
Motlicek, Petr
•
Bourlard, Hervé
2016
2016 IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings
Proceedings of 2016 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2016)

Performing speaker diarization while uniquely identifying the speakers across a collection of audio recordings is a challenging task. Building on our previous work on speaker diarization and linking, we developed a system for diarizing longitudinal TV show data sets that combines the fusion of speaker diarization system outputs with speaker linking. Agreement between multiple diarization outputs is found prior to speaker linking, which substantially reduces the diarization error rate at the expense of leaving some speech data unlabelled. To deal with noisy clusters, a linear-prediction-based technique was used to label speakers after linking. Considerable gains are reported for both fusion and labelling. Despite the challenges of the longitudinal diarization task, the system obtained similar performance on linked and non-linked tasks under moderate session variability, highlighting the viability of a linking approach to longitudinal diarization of speech in the presence of noise, music and special audio effects.
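
The agreement step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes two diarization outputs given as per-frame cluster ids, maps the clusters of one system onto the other by greatest frame overlap (a greedy choice made here for illustration), and keeps a label only where the two systems agree. The function name `fuse_by_agreement` and the `UNLABELLED` marker are hypothetical.

```python
from collections import Counter

UNLABELLED = -1  # hypothetical marker for frames on which the systems disagree


def fuse_by_agreement(labels_a, labels_b):
    """Keep a frame's label only where two diarization outputs agree.

    labels_a, labels_b: equal-length lists of per-frame cluster ids.
    Cluster ids are arbitrary within each system, so every cluster of
    system B is first mapped to the cluster of system A it overlaps most.
    This greedy mapping is an assumption of the sketch, not the paper's method.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both outputs must cover the same frames")

    # Count how often each (a, b) pair of cluster ids co-occurs.
    overlap = Counter(zip(labels_a, labels_b))

    # For every B cluster, pick the A cluster sharing the most frames.
    best = {}
    for (a, b), n in overlap.items():
        if b not in best or n > best[b][1]:
            best[b] = (a, n)
    b_to_a = {b: a for b, (a, _) in best.items()}

    # Frames where the mapped labels differ are left unlabelled.
    return [
        a if b_to_a[b] == a else UNLABELLED
        for a, b in zip(labels_a, labels_b)
    ]


if __name__ == "__main__":
    system_a = [0, 0, 0, 1, 1, 1]
    system_b = [5, 5, 7, 7, 7, 7]
    print(fuse_by_agreement(system_a, system_b))  # [0, 0, -1, 1, 1, 1]
```

In this toy input the two systems disagree on the third frame only, so that frame is dropped from the labelled data. This mirrors the trade-off stated in the abstract: fewer labelled frames in exchange for cleaner clusters passed on to speaker linking.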

Type
conference paper
DOI
10.1109/ICASSP.2016.7472728
Web of Science ID

WOS:000388373405129

Author(s)
Ferras, Marc
Madikeri, Srikanth
Motlicek, Petr
Bourlard, Hervé
Date Issued

2016

Publisher

IEEE

Publisher place

New York

Published in
2016 IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings
ISBN of the book

978-1-4799-9988-0

Total of pages

5

Start page

5495

End page

5499

Subjects

speaker diarization • linking • longitudinal • fusion • clustering • i-vector • ward

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
LIDIAP  
Event name
Proceedings of 2016 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2016)

Event place

Shanghai

Available on Infoscience
May 19, 2016
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/126198
Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved