research article

Sparse Modeling of Neural Network Posterior Probabilities for Exemplar-based Speech Recognition

Dighe, Pranay • Asaei, Afsaneh • Bourlard, Hervé
2016
Speech Communication

In this paper, a compressive sensing (CS) perspective on exemplar-based speech processing is proposed. Relying on an analytical relationship between the CS formulation and statistical speech recognition (hidden Markov models, HMMs), the automatic speech recognition (ASR) problem is cast as the recovery of a high-dimensional sparse word representation from the observed low-dimensional acoustic features. The acoustic features are exemplars obtained from (deep) neural network sub-word conditional posterior probabilities. Low-dimensional word manifolds are learned from these sub-word posterior exemplars and exploited to construct a linguistic dictionary for sparse representation of word posteriors. Dictionary learning has been found to be a principled way to alleviate the need for the huge collections of exemplars required by conventional exemplar-based approaches, while still improving performance. Context appending and collaborative hierarchical sparsity are used to exploit the sequential and group structure underlying the word sparse representations. This formulation leads to a posterior-based sparse modeling approach to speech recognition. The potential of the proposed approach is demonstrated on isolated word (Phonebook corpus) and continuous speech (Numbers corpus) recognition tasks.
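The abstract describes casting word recognition as sparse recovery of word posteriors over a linguistic dictionary learned from sub-word posterior exemplars. The sketch below illustrates that general idea only; it is not the authors' implementation. It uses synthetic posterior-like vectors in place of DNN posteriors from the Phonebook/Numbers corpora, scikit-learn's DictionaryLearning for hypothetical per-word dictionaries, a Lasso solve as a stand-in for the CS recovery, and a simple per-word grouping of coefficients as a crude proxy for the collaborative hierarchical sparsity mentioned in the abstract.

```python
# Minimal sketch (not the paper's implementation): exemplar-based sparse
# recovery of a word label from sub-word-posterior-like features.
# All data here is synthetic and all parameter choices are illustrative.
import numpy as np
from sklearn.decomposition import DictionaryLearning
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# Hypothetical setup: 30-dimensional posterior-like exemplars for 5 words,
# 40 training exemplars per word (stand-ins for real DNN posterior frames).
n_dims, n_words, n_train = 30, 5, 40
word_means = rng.dirichlet(np.ones(n_dims), size=n_words)   # one "manifold" centre per word
exemplars = {
    w: np.abs(word_means[w] + 0.05 * rng.standard_normal((n_train, n_dims)))
    for w in range(n_words)
}

# Learn a small dictionary per word from its exemplars, then stack the
# per-word dictionaries to form one "linguistic dictionary".
atoms_per_word = 8
dicts = []
for w in range(n_words):
    dl = DictionaryLearning(n_components=atoms_per_word, alpha=0.1,
                            max_iter=200, random_state=0)
    dl.fit(exemplars[w])
    dicts.append(dl.components_)          # shape: (atoms_per_word, n_dims)
D = np.vstack(dicts)                      # shape: (n_words * atoms_per_word, n_dims)

# Test frame: a noisy posterior-like vector drawn near word 3's manifold.
true_word = 3
y = np.abs(word_means[true_word] + 0.05 * rng.standard_normal(n_dims))

# Sparse recovery: find a sparse non-negative combination of dictionary
# atoms that reconstructs the observed feature vector.
lasso = Lasso(alpha=0.01, positive=True, max_iter=5000)
lasso.fit(D.T, y)                         # columns of D.T are the atoms
coefs = lasso.coef_.reshape(n_words, atoms_per_word)

# Group the sparse coefficients per word; the dominant group gives the decision.
scores = np.abs(coefs).sum(axis=1)
print("per-word activation:", np.round(scores, 3))
print("decoded word:", scores.argmax(), "(true word:", true_word, ")")
```

The paper itself operates on posterior probabilities with context appending and a collaborative hierarchical sparsity model; the group-sum decision above is only a simplified illustration of how sparse coefficients over word-specific sub-dictionaries can drive a recognition decision.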

Type
research article
DOI
10.1016/j.specom.2015.06.002
Web of Science ID
WOS:000389009300015
Author(s)
Dighe, Pranay
Asaei, Afsaneh
Bourlard, Hervé
Date Issued
2016
Published in
Speech Communication
Volume
76
Start page
230
End page
244
Subjects
  • Automatic speech recognition
  • Deep neural network posterior features
  • Compressive sensing
  • Sparse word posterior probabilities
  • Dictionary learning
  • Sparse modeling
Note
Speech Communication: Special Issue on Advances in Sparse Modeling and Low-rank Modeling for Speech Processing
Editorial or Peer reviewed
REVIEWED
Written at
EPFL
EPFL units
LIDIAP
Available on Infoscience
June 19, 2015
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/115238