conference paper not in proceedings
pyannote.audio: neural building blocks for speaker diarization
2020
We introduce pyannote.audio, an open-source toolkit written in Python for speaker diarization. Based on the PyTorch machine learning framework, it provides a set of trainable end-to-end neural building blocks that can be combined and jointly optimized to build speaker diarization pipelines. pyannote.audio also comes with pre-trained models covering a wide range of domains for voice activity detection, speaker change detection, overlapped speech detection, and speaker embedding, reaching state-of-the-art performance for most of them.
Type
conference paper not in proceedings
ArXiv ID
1911.01255
Author(s)
Bredin, Herve
Yin, Ruiqing
Coria, Juan Manuel
Korshunov, Pavel
Lavechin, Marvin
Fustes, Diego
Titeux, Hadrien
Bouaziz, Wassim
Gill, Marie-Philippe
Date Issued
2020
Editorial or Peer reviewed
REVIEWED
Written at
EPFL
Available on Infoscience
May 27, 2020