The Speed Submission to DIHARD II: Contributions & Lessons Learned

Sahidullah, Md; Patino, Jose; Cornell, Samuele; Yin, Ruiqing; Sivasankaran, Sunit; Bredin, Herve; Korshunov, Pavel; Brutti, Alessio; Serizel, Romain; Vincent, Emmanuel; Evans, Nicholas; Marcel, Sébastien; Squartini, Stefano; Barras, Claude

This paper describes the speaker diarization systems developed for the Second DIHARD Speech Diarization Challenge (DIHARD II) by the Speed team. Besides describing the system, which considerably outperformed the challenge baselines, we also focus on the lessons learned from the numerous approaches that we tried for single- and multi-channel systems. We present several components of our diarization system, including categorization of domains, speech enhancement, speech activity detection, speaker embeddings, clustering methods, resegmentation, and system fusion. We analyze and discuss the effect of each of these components on the overall diarization performance within the realistic settings of the challenge.

Type

report

Author(s)

Sahidullah, Md

Patino, Jose

Cornell, Samuele

Yin, Ruiqing

Sivasankaran, Sunit

Bredin, Herve

Korshunov, Pavel

Brutti, Alessio

Serizel, Romain

Vincent, Emmanuel

Date Issued

2019

Publisher

Idiap

Subjects

diarization

DIHARD challenge

evaluation

single-channel and multi-channel speech

URL

http://publications.idiap.ch/downloads/reports/2019/Sahidullah_Idiap-RR-14-2019.pdf

Written at

EPFL

EPFL units

LIDIAP

Available on Infoscience

February 18, 2020

Use this identifier to reference this record

https://infoscience.epfl.ch/handle/20.500.14299/166350