Sparse Modeling of Neural Network Posterior Probabilities for Exemplar-based Speech Recognition

Dighe, Pranay; Asaei, Afsaneh; Bourlard, Hervé

doi:10.1016/j.specom.2015.06.002

2016

Abstract

In this paper, a compressive sensing (CS) perspective on exemplar-based speech processing is proposed. Relying on an analytical relationship between the CS formulation and statistical speech recognition (hidden Markov models, HMMs), the automatic speech recognition (ASR) problem is cast as the recovery of a high-dimensional sparse word representation from the observed low-dimensional acoustic features. The acoustic features are exemplars obtained from (deep) neural network sub-word conditional posterior probabilities. Low-dimensional word manifolds are learned from these sub-word posterior exemplars and exploited to construct a linguistic dictionary for the sparse representation of word posteriors. Dictionary learning is found to be a principled way to alleviate the need for a huge collection of exemplars, as required in conventional exemplar-based approaches, while still improving performance. Context appending and collaborative hierarchical sparsity are used to exploit the sequential and group structure underlying the sparse word representation. This formulation leads to a posterior-based sparse modeling approach to speech recognition. The potential of the proposed approach is demonstrated on isolated word (Phonebook corpus) and continuous speech (Numbers corpus) recognition tasks.
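The core idea of the abstract, recovering a sparse word-level representation from a sub-word posterior vector using a dictionary of exemplars, can be illustrated with a minimal sketch. This is not the authors' method: it substitutes a plain non-negative l1-regularised least-squares problem solved by ISTA for the collaborative hierarchical sparsity and learned dictionaries described in the paper. The function names (`ista_nonneg`, `word_scores`), the three-class toy posteriors, and the words "yes" and "no" are all hypothetical and chosen only for illustration.

```python
# Toy sketch (assumption: plain ISTA with an l1 penalty, not the paper's solver).

def ista_nonneg(atoms, x, lam=0.01, iters=500):
    """Minimise 0.5*||x - sum_j a_j*atoms[j]||^2 + lam*sum_j a_j subject to a_j >= 0."""
    k, d = len(atoms), len(x)
    # The squared Frobenius norm upper-bounds the largest eigenvalue of the
    # Gram matrix, so its inverse is a safe (conservative) step size.
    step = 1.0 / sum(v * v for atom in atoms for v in atom)
    a = [0.0] * k
    for _ in range(iters):
        recon = [sum(a[j] * atoms[j][i] for j in range(k)) for i in range(d)]
        resid = [recon[i] - x[i] for i in range(d)]
        grad = [sum(atoms[j][i] * resid[i] for i in range(d)) for j in range(k)]
        # Gradient step followed by non-negative soft-thresholding.
        a = [max(0.0, a[j] - step * (grad[j] + lam)) for j in range(k)]
    return a


def word_scores(coeffs, labels):
    """Sum the sparse coefficients per word and normalise them to sum to one."""
    totals = {}
    for c, w in zip(coeffs, labels):
        totals[w] = totals.get(w, 0.0) + c
    norm = sum(totals.values()) or 1.0
    return {w: t / norm for w, t in totals.items()}


# Hypothetical dictionary: each atom is a posterior vector over 3 sub-word
# classes, labelled with the word it is an exemplar of.
atoms = [
    [0.90, 0.05, 0.05],  # exemplar of "yes"
    [0.80, 0.15, 0.05],  # exemplar of "yes"
    [0.05, 0.90, 0.05],  # exemplar of "no"
    [0.10, 0.10, 0.80],  # exemplar of "no"
]
labels = ["yes", "yes", "no", "no"]

# Hypothetical observed sub-word posterior vector for one frame.
x = [0.85, 0.10, 0.05]

coeffs = ista_nonneg(atoms, x)
scores = word_scores(coeffs, labels)
best = max(scores, key=scores.get)
```

Summing coefficients per word is one simple way to turn a sparse code into word-level scores; the group structure that the abstract attributes to collaborative hierarchical sparsity is only mimicked here by this post-hoc grouping.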

Details

Title Sparse Modeling of Neural Network Posterior Probabilities for Exemplar-based Speech Recognition

Author(s) Dighe, Pranay ; Asaei, Afsaneh ; Bourlard, Hervé

Published in Speech Communication

Volume 76

Pages 230-244

Date 2016

ISSN 0167-6393

Keywords

Automatic speech recognition; Deep neural network posterior features; Compressive sensing; Sparse word posterior probabilities; Dictionary learning; Sparse modeling

Note Speech Communication: Special Issue on Advances in Sparse Modeling and Low-rank Modeling for Speech Processing

DOI https://doi.org/10.1016/j.specom.2015.06.002

Laboratories LIDIAP

Record Appears in Scientific production and competences > STI - School of Engineering > IEM - Institut d'Electricité et de Microtechnique > LIDIAP - L'IDIAP Laboratory
Scientific production and competences > Euler Center for Signal Processing
Peer-reviewed publications
Work produced at EPFL
Journal Articles
Published

Record creation date 2015-06-19
