Infoscience

Discriminative Keyword Spotting

Grangier, David • Keshet, Joseph • Bengio, Samy
In: Keshet, Joseph • Bengio, Samy (eds.), Automatic Speech and Speaker Recognition: Large Margin and Kernel Methods, 2009

This chapter introduces a discriminative method for detecting and spotting keywords in spoken utterances. Given a word represented as a sequence of phonemes and a spoken utterance, the keyword spotter predicts the best time span of the phoneme sequence in the utterance, along with a confidence. If the prediction confidence is above a certain level, the keyword is declared to be spoken in the utterance within the predicted time span; otherwise, the keyword is declared not spoken. Keyword spotter training is formulated as a discriminative task in which the model parameters are chosen so that an utterance in which the keyword is spoken receives a higher confidence than any utterance in which the keyword is not spoken. It is shown theoretically and empirically that the proposed training method yields a high area under the receiver operating characteristic (ROC) curve, the most common measure for evaluating keyword spotters. We present an iterative algorithm to train the keyword spotter efficiently. The proposed approach contrasts with standard spotting strategies based on HMMs, for which the training procedure does not optimize a loss directly related to spotting performance. Several experiments performed on the TIMIT and WSJ corpora show the advantage of our approach over HMM-based alternatives.
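The link between the pairwise training criterion and the area under the ROC curve can be illustrated with a small sketch. It is not the chapter's algorithm: it only shows the thresholded decision rule and the standard identity that the AUC equals the fraction of (keyword present, keyword absent) utterance pairs in which the positive utterance gets the higher confidence. The function names `detect` and `pairwise_auc` and all scores are hypothetical, chosen for illustration.

```python
from typing import Sequence


def detect(confidence: float, threshold: float) -> bool:
    """Declare the keyword spoken when the confidence exceeds the threshold."""
    return confidence > threshold


def pairwise_auc(pos: Sequence[float], neg: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (Wilcoxon-Mann-Whitney statistic).
    """
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative score")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical confidences, invented for illustration only.
pos_scores = [0.9, 0.7, 0.4]  # utterances that contain the keyword
neg_scores = [0.5, 0.2]       # utterances that do not
auc = pairwise_auc(pos_scores, neg_scores)
```

Under this view, a training criterion that pushes every positive utterance above every negative one is directly aligned with the evaluation measure, whatever threshold is later chosen for `detect`.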

Type: book part or chapter
DOI: 10.1002/9780470742044.ch11
Author(s): Grangier, David • Keshet, Joseph • Bengio, Samy
Editors: Keshet, Joseph • Bengio, Samy
Date Issued: 2009
Publisher: John Wiley and Sons
Published in: Automatic Speech and Speaker Recognition: Large Margin and Kernel Methods
ISBN of the book: 9780470696835
Start page: 173
End page: 194
Written at: EPFL
EPFL units: LIDIAP
Available on Infoscience: February 11, 2010
Use this identifier to reference this record: https://infoscience.epfl.ch/handle/20.500.14299/46806

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved