Robust Discriminative Keyword Spotting for Emotionally Colored Spontaneous Speech using Bidirectional LSTM Networks

Wöllmer, Martin; Eyben, Florian; Keshet, Joseph; Graves, Alex; Schuller, Björn; Rigoll, Gerhard

doi:10.1109/ICASSP.2009.4960492

2009

Abstract

In this paper we propose a new technique for robust keyword spotting that uses bidirectional Long Short-Term Memory (BLSTM) recurrent neural networks to incorporate contextual information in speech decoding. Our approach overcomes the drawbacks of generative HMM modeling by applying a discriminative learning procedure that non-linearly maps speech features into an abstract vector space. Because the outputs of a BLSTM network are incorporated into the speech features, the keyword spotter can use both past and future context for phoneme predictions. The robustness of the approach is evaluated on a keyword spotting task using the HUMAINE Sensitive Artificial Listener (SAL) database, which contains accented, spontaneous, and emotionally colored speech. The test is particularly stringent because the system is not trained on the SAL database, but only on the TIMIT corpus of read speech. We show that our method outperforms a discriminative keyword spotter without BLSTM-enhanced feature functions, which in turn has been shown to outperform HMM-based techniques.
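
To make the idea of "past and future context" concrete, the following is a minimal sketch and not the authors' implementation. It assumes a plain tanh recurrent cell in place of a real LSTM cell, random untrained weights, and hypothetical names throughout (`rnn_pass`, `bidirectional_posteriors`, `augment`, `make_params`). It only illustrates how a forward and a backward pass over the frames can be combined into per-frame phoneme posteriors, which are then appended to the acoustic features.

```python
import math
import random


def rnn_pass(frames, w_in, w_rec, reverse=False):
    """Run a plain tanh recurrent layer (a stand-in for LSTM) over the frames.

    Returns one hidden vector per frame, stored at the frame's own index.
    """
    hidden_size = len(w_rec)
    h = [0.0] * hidden_size
    outputs = [None] * len(frames)
    order = range(len(frames) - 1, -1, -1) if reverse else range(len(frames))
    for t in order:
        x = frames[t]
        h = [
            math.tanh(
                sum(w_in[j][i] * x[i] for i in range(len(x)))
                + sum(w_rec[j][k] * h[k] for k in range(hidden_size))
            )
            for j in range(hidden_size)
        ]
        outputs[t] = h
    return outputs


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bidirectional_posteriors(frames, params):
    """Combine a forward and a backward pass into per-frame posteriors."""
    fwd = rnn_pass(frames, params["w_in_f"], params["w_rec_f"])
    bwd = rnn_pass(frames, params["w_in_b"], params["w_rec_b"], reverse=True)
    posteriors = []
    for hf, hb in zip(fwd, bwd):
        h = hf + hb  # concatenate past-context and future-context states
        scores = [sum(w * v for w, v in zip(row, h)) for row in params["w_out"]]
        posteriors.append(softmax(scores))
    return posteriors


def augment(frames, posteriors):
    """Append each frame's posterior vector to its acoustic feature vector."""
    return [list(x) + list(p) for x, p in zip(frames, posteriors)]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def make_params(n_features, n_hidden, n_phonemes, seed=0):
    """Random, untrained weights; purely illustrative."""
    rng = random.Random(seed)
    return {
        "w_in_f": random_matrix(n_hidden, n_features, rng),
        "w_rec_f": random_matrix(n_hidden, n_hidden, rng),
        "w_in_b": random_matrix(n_hidden, n_features, rng),
        "w_rec_b": random_matrix(n_hidden, n_hidden, rng),
        "w_out": random_matrix(n_phonemes, 2 * n_hidden, rng),
    }


if __name__ == "__main__":
    params = make_params(n_features=3, n_hidden=4, n_phonemes=5)
    frames = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.4], [0.5, 0.5, -0.2]]
    enhanced = augment(frames, bidirectional_posteriors(frames, params))
    print(len(enhanced), len(enhanced[0]))
```

In this sketch, changing the last frame alters the posteriors of the first frame only through the backward pass, which is the sense in which a bidirectional network sees future context. A trained BLSTM would replace the tanh cell and the random weights.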

Details

Title Robust Discriminative Keyword Spotting for Emotionally Colored Spontaneous Speech using Bidirectional LSTM Networks

Author(s) Wöllmer, Martin ; Eyben, Florian ; Keshet, Joseph ; Graves, Alex ; Schuller, Björn ; Rigoll, Gerhard

Published in 2009 IEEE International Conference on Acoustics, Speech and Signal Processing

Pages 3949-3952

Presented at IEEE International Conference on Acoustics, Speech, and Signal Processing, Taipei, Taiwan

Date 2009

DOI https://doi.org/10.1109/ICASSP.2009.4960492

Laboratories LIDIAP

The document appears in Scientific production and competences > STI - Faculté des sciences et techniques de l'ingénieur > IEM - Institute of Electrical and Micro Engineering > LIDIAP - Laboratoire de l'IDIAP
Scientific production and competences > Euler Center for Signal Processing
Conference papers
Work produced at EPFL
Published

Record creation date 2010-02-11
