conference paper

Analysis of Language Dependent Front-End for Speaker Recognition

Madikeri, Srikanth; Dey, Subhadeep; Motlicek, Petr
January 1, 2018
19th Annual Conference of the International Speech Communication Association (Interspeech 2018), Vols 1-6
19th Annual Conference of the International Speech Communication Association (INTERSPEECH 2018)

In Deep Neural Network (DNN) i-vector based speaker recognition systems, acoustic models trained for Automatic Speech Recognition are employed to estimate sufficient statistics for i-vector modeling. The DNN-based acoustic model is typically trained on a well-resourced language like English. In evaluation conditions where enrollment and test data are not in English, as in the NIST SRE 2016 dataset, a DNN acoustic model generalizes poorly. In such conditions, a conventional Universal Background Model/Gaussian Mixture Model (UBM/GMM) based i-vector extractor performs better than the DNN-based i-vector system. In this paper, we address the scenario in which one can develop an Automatic Speech Recognizer with limited resources for a language present in the evaluation condition, thus enabling the use of a DNN acoustic model instead of UBM/GMM. Experiments are performed on the Tagalog subset of the NIST SRE 2016 dataset assuming an open training condition. With a DNN i-vector system trained for Tagalog, a relative improvement of 12.1% is obtained over a baseline system trained for English.
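
As an illustration of the "sufficient statistics" mentioned in the abstract, the sketch below accumulates zeroth- and first-order statistics for one utterance from per-frame class posteriors. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the plain-list data layout, and the helper for relative improvement are hypothetical, and the abstract does not say which error metric the 12.1% figure refers to. In a DNN i-vector system the posteriors would come from the acoustic model's output classes; in a UBM/GMM system they would be the Gaussian component posteriors.

```python
from typing import List, Sequence, Tuple


def sufficient_statistics(
    posteriors: Sequence[Sequence[float]],
    features: Sequence[Sequence[float]],
) -> Tuple[List[float], List[List[float]]]:
    """Accumulate zeroth- and first-order statistics over one utterance.

    posteriors[t][c]: posterior of class c (DNN output class or UBM
                      component) for frame t. Hypothetical layout.
    features[t]:      acoustic feature vector of frame t.
    """
    if not posteriors or len(posteriors) != len(features):
        raise ValueError("need the same non-zero number of frames in both inputs")

    num_classes = len(posteriors[0])
    dim = len(features[0])
    zeroth = [0.0] * num_classes                        # N_c = sum_t gamma_t(c)
    first = [[0.0] * dim for _ in range(num_classes)]   # F_c = sum_t gamma_t(c) * x_t

    for gamma_t, x_t in zip(posteriors, features):
        for c, gamma in enumerate(gamma_t):
            zeroth[c] += gamma
            for d, value in enumerate(x_t):
                first[c][d] += gamma * value
    return zeroth, first


def relative_improvement(baseline: float, system: float) -> float:
    """Relative reduction, in percent, of an error metric against a baseline."""
    return 100.0 * (baseline - system) / baseline
```

Only the source of the posteriors differs between the two front-ends; the accumulation itself is the same, which is why swapping the English-trained acoustic model for a Tagalog-trained one changes the statistics without changing the rest of the i-vector pipeline.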

Type
conference paper
DOI
10.21437/Interspeech.2018-2071
Web of Science ID

WOS:000465363900231

Author(s)
Madikeri, Srikanth
Dey, Subhadeep
Motlicek, Petr  
Date Issued

2018-01-01

Publisher

ISCA-INT SPEECH COMMUNICATION ASSOC

Publisher place

Baixas

Published in
19th Annual Conference of the International Speech Communication Association (Interspeech 2018), Vols 1-6
ISBN of the book

978-1-5108-7221-9

Series title/Series vol.

Interspeech

Start page

1101

End page

1105

Subjects

  • Computer Science, Artificial Intelligence
  • Computer Science, Theory & Methods
  • Engineering, Electrical & Electronic
  • Computer Science
  • Engineering
  • i-vector
  • speaker recognition
  • deep neural networks

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
LIDIAP  
Event name

19th Annual Conference of the International Speech Communication Association (INTERSPEECH 2018)

Event place

Hyderabad, India

Event date

Sep 02-06, 2018

Available on Infoscience
June 18, 2019
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/156868