conference paper

Exploiting sequence information for text-dependent Speaker Verification

Dey, Subhadeep • Motlicek, Petr • Madikeri, Srikanth • Ferras, Marc
2017
2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Proceedings of 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing

Model-based approaches to Speaker Verification (SV), such as Joint Factor Analysis (JFA), i-vector and relevance Maximum-a-Posteriori (MAP), have been shown to provide state-of-the-art performance for text-dependent systems with fixed phrases. The performance of i-vector and JFA models has been further enhanced by estimating posteriors from a Deep Neural Network (DNN) instead of a Gaussian Mixture Model (GMM). While both DNNs and GMMs aim at incorporating phonetic information of the phrase with these posteriors, model-based SV approaches ignore the sequence information of the phonetic units of the phrase. In this paper, we tackle this issue by applying dynamic time warping using speaker-informative features. We propose to use i-vectors computed from short segments of each speech utterance, also called online i-vectors, as feature vectors. The proposed approach is evaluated on the RedDots database and provides a 75% relative improvement in equal error rate over the best model-based SV baseline system in a content-mismatch condition.
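
The idea of aligning two utterances with dynamic time warping over per-segment feature vectors can be sketched as follows. This is only a minimal illustration, not the authors' implementation: the function names `cosine_distance` and `dtw_distance`, the choice of cosine distance as the local cost, and the normalization by the summed sequence lengths are all assumptions that the abstract does not specify. The extraction of online i-vectors is not shown; short toy vectors stand in for them.

```python
import math


def cosine_distance(u, v):
    """Cosine distance between two equal-length vectors (assumed local cost)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # treat a zero vector as maximally dissimilar
    return 1.0 - dot / (norm_u * norm_v)


def dtw_distance(enroll, test):
    """Length-normalized DTW cost between two sequences of feature vectors.

    `enroll` and `test` are lists of vectors, standing in for sequences of
    online i-vectors (hypothetical input format).
    """
    n, m = len(enroll), len(test)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning enroll[:i] with test[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = cosine_distance(enroll[i - 1], test[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in the enrollment sequence only
                cost[i][j - 1],      # advance in the test sequence only
                cost[i - 1][j - 1],  # advance in both sequences
            )
    # Normalizing by n + m is one common convention, assumed here.
    return cost[n][m] / (n + m)


# Toy sequences: same content in the same order versus in reversed order.
phrase = [[1.0, 0.0], [0.0, 1.0]]
same_order = [[1.0, 0.0], [0.0, 1.0]]
reversed_order = [[0.0, 1.0], [1.0, 0.0]]

print(dtw_distance(phrase, same_order))
print(dtw_distance(phrase, reversed_order))
```

Because the alignment is monotonic in time, the reversed sequence receives a higher cost than the identically ordered one even though both contain the same vectors. This is the kind of sequence sensitivity that an utterance-level model, which pools over all frames, does not capture.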

Type: conference paper
DOI: 10.1109/ICASSP.2017.7953182
Web of Science ID: WOS:000414286205106
Author(s): Dey, Subhadeep; Motlicek, Petr; Madikeri, Srikanth; Ferras, Marc
Date Issued: 2017
Publisher: IEEE
Publisher place: New York
Published in: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
ISBN of the book: 978-1-5090-4117-6
Total of pages: 5
Start page: 5370
End page: 5374
Subjects: Text-dependent speaker verification • DNN posteriors • Dynamic Time Warping
Editorial or Peer reviewed: REVIEWED
Written at: EPFL
EPFL units: LIDIAP
Event name: Proceedings of 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing
Event place: New Orleans, LA
Event date: March 05-09, 2017
Available on Infoscience: December 19, 2016
Use this identifier to reference this record: https://infoscience.epfl.ch/handle/20.500.14299/132080