conference paper

Robust and Discriminative Speaker Embedding via Intra-Class Distance Variance Regularization

Le, Nam
•
Odobez, Jean-Marc
2018
19th Annual Conference of the International Speech Communication Association (Interspeech 2018), Vols 1-6
Proceedings of Interspeech

Learning a good speaker embedding is critical for many speech processing tasks, including recognition, verification, and diarization. To this end, we propose a complementary optimization objective called intra-class loss to improve deep speaker embeddings learned with triplet loss. This loss function is formulated as a soft constraint on the averaged pairwise distance between samples from the same class. Its goal is to prevent these samples from scattering within the embedding space, thereby increasing intra-class compactness. When intra-class loss is jointly optimized with triplet loss, we observe two major improvements: the deep embedding network achieves a more robust and discriminative representation, and the training process is more stable with a faster convergence rate. We conduct experiments on two large public benchmark datasets for speaker verification, VoxCeleb and VoxForge. The results show that intra-class loss helps accelerate the convergence of deep network training and significantly improves the overall performance of the resulting embeddings.
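
The abstract describes the intra-class loss only in words, so the following is a rough sketch of the idea and not the authors' formulation. It assumes Euclidean distances, a hinge on the mean pairwise distance within each class, and a simple weighted sum with a standard triplet loss. The names `slack`, `weight`, and `margin`, and their default values, are hypothetical and do not come from the record.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard hinge triplet loss: wants d(a, p) + margin <= d(a, n)."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative) + margin
    return max(0.0, gap)


def intra_class_loss(embeddings, labels, slack=0.5):
    """Assumed form: penalize classes whose mean pairwise distance exceeds slack."""
    by_class = {}
    for emb, lab in zip(embeddings, labels):
        by_class.setdefault(lab, []).append(emb)

    penalties = []
    for members in by_class.values():
        pairs = list(combinations(members, 2))
        if not pairs:
            continue  # a class with one sample has no pairwise distance
        mean_dist = sum(euclidean(a, b) for a, b in pairs) / len(pairs)
        penalties.append(max(0.0, mean_dist - slack))  # soft (hinge) constraint
    return sum(penalties) / len(penalties) if penalties else 0.0


def joint_loss(triplets, embeddings, labels, weight=1.0):
    """Mean triplet loss plus a weighted intra-class term (weight is hypothetical)."""
    trip = sum(triplet_loss(a, p, n) for a, p, n in triplets) / len(triplets)
    return trip + weight * intra_class_loss(embeddings, labels)
```

In this sketch the intra-class term is zero whenever same-speaker samples already sit within `slack` of each other on average, so it only acts on classes that are spread out. That is one way to read "soft constraint"; the paper itself should be consulted for the exact definition.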

Type
conference paper
DOI
10.21437/Interspeech.2018-1685
Web of Science ID

WOS:000465363900473

Author(s)
Le, Nam
Odobez, Jean-Marc
Date Issued

2018

Publisher

ISCA-INT SPEECH COMMUNICATION ASSOC

Publisher place

Baixas

Published in
19th Annual Conference of the International Speech Communication Association (Interspeech 2018), Vols 1-6
ISBN of the book

978-1-5108-7221-9

Series title/Series vol.

Interspeech

Start page

2257

End page

2261

Subjects

speaker verification

•

deep neural networks

•

embedding learning

•

triplet loss

URL

Related documents

http://publications.idiap.ch/downloads/papers/2018/Le_INTERSPEECH2018_2018.pdf
Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
LIDIAP  
Event name

Proceedings of Interspeech

Event place

Hyderabad, India

Event date

Sep 02-06, 2018

Available on Infoscience
July 26, 2018
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/147529