IDIAP HMM/HMM2 System: Theoretical Basis and Software Specifications

Ikbal, Shajith; Bourlard, Hervé; Bengio, Samy; Weber, Katrin

2001

Abstract

State-of-the-art Automatic Speech Recognition (ASR) systems make extensive use of Hidden Markov Models (HMMs), which offer flexible statistical modeling, powerful optimization (training) techniques, and efficient recognition algorithms. When the software implementation allows it, this flexibility can also be fully exploited in research by testing various topologies, acoustic units, parameterization schemes, etc. Unfortunately, these HMM systems remain excessively sensitive to the variability generally observed in real acoustic environments, including speaker, channel, and noise characteristics. To tackle this problem, IDIAP recently introduced a new form of HMM, referred to as HMM2, whose potential advantages could improve the robustness of current speech recognition systems. HMM2 can be described as a mixture of HMMs in which the HMM emission probabilities (usually estimated by Gaussian mixtures or a neural network) are themselves estimated by state-specific HMMs working along the acoustic features. Among other properties, such an HMM2 approach is believed to better model the time/frequency speech flow, including the correlation between features. After a brief reminder of HMM theory, this report first introduces the theoretical basis of HMM2, including its parameterization schemes and the estimation of its parameters through a generalized form of the Expectation-Maximization (EM) training algorithm. The report also describes the functionalities and specifications of a new software package able to handle, in a flexible way, different forms of HMM and HMM2 topologies and training schemes.
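
The nesting described in the abstract can be illustrated with a minimal sketch. It is not the IDIAP software and is not taken from the report: all function names, the scalar Gaussian emissions of the inner HMM, and the toy parameters are assumptions made for illustration only. An outer (temporal) HMM runs the forward algorithm over frames, and the emission likelihood of each frame in a given temporal state is itself obtained by running the forward algorithm of that state's inner HMM along the components of the feature vector. No scaling or log-space arithmetic is used, so the sketch is only suitable for very short inputs.

```python
import math


def forward_likelihood(init, trans, emit, observations):
    """Forward algorithm: total likelihood of `observations` under an HMM.

    init[i]      initial probability of state i
    trans[i][j]  transition probability from state i to state j
    emit(i, o)   emission likelihood of observation o in state i
    """
    n = len(init)
    alpha = [init[i] * emit(i, observations[0]) for i in range(n)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit(j, obs)
            for j in range(n)
        ]
    return sum(alpha)


def gaussian(x, mean, var):
    """Density of a one-dimensional Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def make_inner_hmm(init, trans, means, variances):
    """Build a hypothetical inner HMM that scores ONE feature vector.

    Its "time" axis is the component index of the feature vector, and each
    inner state emits a scalar component through a Gaussian (an assumption).
    """
    def emit(k, component):
        return gaussian(component, means[k], variances[k])

    def likelihood(feature_vector):
        return forward_likelihood(init, trans, emit, feature_vector)

    return likelihood


def hmm2_likelihood(outer_init, outer_trans, inner_hmms, frames):
    """Likelihood of a frame sequence; inner_hmms[q] scores frames in state q."""
    def emit(state, frame):
        return inner_hmms[state](frame)

    return forward_likelihood(outer_init, outer_trans, emit, frames)


# Toy usage: two temporal states, each with a two-state left-to-right inner HMM.
inner_a = make_inner_hmm([1.0, 0.0], [[0.6, 0.4], [0.0, 1.0]], [0.0, 2.0], [1.0, 1.0])
inner_b = make_inner_hmm([1.0, 0.0], [[0.6, 0.4], [0.0, 1.0]], [2.0, 0.0], [1.0, 1.0])
frames = [[0.1, 0.3, 1.9, 2.2], [1.8, 2.1, 0.2, -0.1]]
print(hmm2_likelihood([0.5, 0.5], [[0.7, 0.3], [0.3, 0.7]], [inner_a, inner_b], frames))
```

With a single-state inner HMM, the emission reduces to a product of independent Gaussians over the components. Adding inner states lets the model segment the feature vector into regions with different statistics, which is one way to read the abstract's claim about modeling correlation across features.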

Details

Publisher Martigny, Switzerland, IDIAP

Keywords

speech; Ikbal; Bourlard; Bengio; Weber

Laboratories LIDIAP

Record Appears in Work produced at EPFL; Technical Reports; Published

Record creation date 2006-03-10
