Hierarchical Multi-Stream Posterior Based Speech Recognition System

Ketabdar, Hamed; Bourlard, Hervé; Bengio, Samy

2005

Abstract

In this paper, we present initial results towards boosting posterior based speech recognition systems by estimating more informative posteriors using multiple streams of features and taking into account acoustic context (e.g., as available in the whole utterance), as well as possible prior information (such as topological constraints). These posteriors are estimated based on the "state gamma posterior" definition (typically used in standard HMM training), extended to the case of multi-stream HMMs. This approach provides a new, principled, theoretical framework for hierarchical estimation and use of posteriors, multi-stream feature combination, and integration of appropriate context and prior knowledge into posterior estimates. In the present work, we used the resulting gamma posteriors as features for a standard HMM/GMM layer. On the OGI Digits database and on a reduced vocabulary version (1000 words) of the DARPA Conversational Telephone Speech-to-text (CTS) task, this resulted in a significant performance improvement over state-of-the-art Tandem systems.
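
For readers unfamiliar with the term, the state gamma posterior of a standard HMM is the probability of being in a given state at a given frame, conditioned on the whole utterance, and it is usually computed with the forward-backward recursions. The sketch below is a minimal illustration of that textbook definition, not the system described in the report. The function names `gamma_posteriors` and `combine_streams` are hypothetical, and combining streams by multiplying their per-state likelihoods is an assumption made here for illustration only; the abstract does not specify how the streams are combined.

```python
def combine_streams(streams):
    """Multiply per-state likelihoods across feature streams.

    ASSUMPTION (not from the report): streams are treated as conditionally
    independent given the state, so their likelihoods are multiplied.
    streams[s][t][i] is the likelihood of frame t under state i in stream s.
    """
    T, N = len(streams[0]), len(streams[0][0])
    combined = [[1.0] * N for _ in range(T)]
    for stream in streams:
        for t in range(T):
            for i in range(N):
                combined[t][i] *= stream[t][i]
    return combined


def gamma_posteriors(init, trans, emis):
    """Return gamma[t][i] = P(q_t = i | x_1..x_T) for a single HMM.

    init[i]     : probability of starting in state i
    trans[i][j] : transition probability from state i to state j
                  (zeros encode topological constraints)
    emis[t][i]  : likelihood of the frame at time t under state i
    """
    T, N = len(emis), len(init)

    # Forward pass, rescaled at every frame to avoid numerical underflow.
    alpha = [[0.0] * N for _ in range(T)]
    for t in range(T):
        for j in range(N):
            if t == 0:
                prev = init[j]
            else:
                prev = sum(alpha[t - 1][i] * trans[i][j] for i in range(N))
            alpha[t][j] = prev * emis[t][j]
        scale = sum(alpha[t])
        alpha[t] = [a / scale for a in alpha[t]]

    # Backward pass, also rescaled; the per-frame scale factors cancel
    # when gamma is normalised below.
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(N):
            beta[t][i] = sum(
                trans[i][j] * emis[t + 1][j] * beta[t + 1][j] for j in range(N)
            )
        scale = sum(beta[t])
        beta[t] = [b / scale for b in beta[t]]

    # gamma is proportional to alpha * beta, normalised over states per frame.
    gamma = []
    for t in range(T):
        prod = [alpha[t][i] * beta[t][i] for i in range(N)]
        z = sum(prod)
        gamma.append([p / z for p in prod])
    return gamma
```

Because each row of the result depends on the whole observation sequence and on the transition structure, it reflects both acoustic context and topological constraints, which is the property the abstract relies on when it uses these posteriors as features.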

Details

Title Hierarchical Multi-Stream Posterior Based Speech Recognition System

Author(s) Ketabdar, Hamed ; Bourlard, Hervé ; Bengio, Samy

Date 2005

Publisher IDIAP

Keywords

speech

Laboratories LIDIAP

Record Appears in Scientific production and competences > STI - School of Engineering > IEM - Institut d'Electricité et de Microtechnique > LIDIAP - L'IDIAP Laboratory
Scientific production and competences > Euler Center for Signal Processing
Work produced at EPFL
Technical Reports
Published

Record creation date 2006-03-10
