Towards Robust and Adaptive Speech Recognition Models

Bourlard, Hervé; Bengio, Samy; Weber, Katrin

doi:10.1007/978-1-4419-9017-4_9

book part or chapter

2004

Mathematical Foundations of Speech and Language Processing

In this paper, we discuss a family of new Automatic Speech Recognition (ASR) approaches, which deviate somewhat from the usual ASR approaches but which have recently been shown to be more robust to nonstationary noise, without requiring specific adaptation or "multi-style" training. More specifically, we will motivate and briefly describe new approaches based on multi-stream and subband ASR. These approaches extend the standard hidden Markov model (HMM) based approach by assuming that the different (frequency) streams representing the speech signal are processed by different (independent) "experts", each expert focusing on a different characteristic of the signal, and that the different stream likelihoods (or posteriors) are combined at some (temporal) stage to yield a global recognition output. As a further extension to multi-stream ASR, we will finally introduce a new approach, referred to as HMM2, where the HMM emission probabilities are estimated via state-specific, feature-based HMMs responsible for merging the stream information and modeling their possible correlation.
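To make the stream-combination step concrete, the following is a minimal sketch of one possible way to merge per-stream class posteriors at a single time frame. It is not taken from the paper: the weighted log-linear (geometric mean) rule, the function name `combine_stream_posteriors`, and the `weights` parameter are assumptions chosen for illustration, and the paper may use a different combination rule.

```python
import math


def combine_stream_posteriors(stream_posteriors, weights=None):
    """Merge per-stream class posteriors for one frame (illustrative only).

    stream_posteriors: list of lists, one posterior vector per stream.
    weights: hypothetical per-stream reliability weights; uniform if omitted.
    """
    n_streams = len(stream_posteriors)
    n_classes = len(stream_posteriors[0])
    if weights is None:
        weights = [1.0 / n_streams] * n_streams

    eps = 1e-12  # guards against log(0) for classes a stream rules out
    # Weighted sum of log posteriors, i.e. a weighted geometric mean.
    log_scores = [
        sum(w * math.log(p[k] + eps) for w, p in zip(weights, stream_posteriors))
        for k in range(n_classes)
    ]

    # Subtract the maximum before exponentiating for numerical stability,
    # then renormalize so the result is again a probability distribution.
    top = max(log_scores)
    unnormalized = [math.exp(s - top) for s in log_scores]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Two hypothetical subband experts scoring three classes.
combined = combine_stream_posteriors([[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]])
```

Under this assumed rule, setting a stream's weight to zero removes its influence entirely, which is one simple way to discount a subband that is believed to be corrupted by noise.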

Use this identifier to reference this record

https://infoscience.epfl.ch/handle/20.500.14299/228166

Name

rr02-47.pdf

Access type

openaccess

Size

369.22 KB

Format

Adobe PDF

Checksum (MD5)

d5185da0cd8911d3d3ddf35dca52276e