Enhancing posterior based speech recognition systems
The use of local phoneme posterior probabilities has been increasingly explored for improving speech recognition systems. Hybrid hidden Markov model / artificial neural network (HMM/ANN) and Tandem are the most successful examples of such systems. In this thesis, we present a principled framework for enhancing the estimation of local posteriors by integrating phonetic and lexical knowledge, as well as long contextual information. This framework allows for hierarchical estimation, integration and use of local posteriors from the phoneme level up to the word level. We propose two approaches for enhancing the posteriors. In the first approach, phoneme posteriors estimated with an ANN (in particular a multi-layer perceptron, MLP) are used as emission probabilities in HMM forward-backward recursions. This yields new, enhanced posterior estimates that integrate HMM topological constraints (encoding specific phonetic and lexical knowledge) and long context. In the second approach, a temporal context of the regular MLP posteriors is post-processed by a secondary MLP in order to learn inter- and intra-dependencies among the phoneme posteriors. The learned knowledge is integrated into the posterior estimation during inference (the forward pass) of the second MLP, resulting in enhanced posteriors. The use of the resulting enhanced local posteriors is investigated in a wide range of posterior-based speech recognition systems (e.g. Tandem and hybrid HMM/ANN), as a replacement for, or in combination with, the regular MLP posteriors. The enhanced posteriors consistently outperform the regular posteriors in different applications over small and large vocabulary databases.
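To make the first approach concrete, the following is a minimal NumPy sketch of forward-backward recursions in which per-frame MLP posteriors stand in for the HMM emission probabilities. It is an illustration under stated assumptions, not the thesis implementation: the function name `enhance_posteriors`, the one-state-per-phoneme mapping, the toy left-to-right transition matrix and the example values are all hypothetical. The posteriors are used as emission scores directly, as the abstract describes.

```python
import numpy as np

def enhance_posteriors(mlp_post, trans, init):
    """Forward-backward state posteriors (gamma) over T frames and K states.

    mlp_post: (T, K) per-frame phoneme posteriors, used as emission scores
              (assumption: one HMM state per phoneme).
    trans:    (K, K) matrix, trans[i, j] = P(state j at t+1 | state i at t).
    init:     (K,) initial state distribution.
    """
    T, K = mlp_post.shape
    alpha = np.zeros((T, K))
    beta = np.zeros((T, K))

    # Forward pass, rescaled at every frame to avoid numerical underflow.
    alpha[0] = init * mlp_post[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ trans) * mlp_post[t]
        alpha[t] /= alpha[t].sum()

    # Backward pass, also rescaled per frame.
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = trans @ (mlp_post[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()

    # The per-frame scale factors cancel when each row is renormalised.
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    return gamma

# Toy example (hypothetical values): 3 phoneme states, 5 frames.
post = np.array([
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.5, 0.4],
    [0.1, 0.2, 0.7],
])
# Left-to-right topology: a state may repeat or move to the next state only.
trans = np.array([
    [0.7, 0.3, 0.0],
    [0.0, 0.7, 0.3],
    [0.0, 0.0, 1.0],
])
init = np.array([1.0, 0.0, 0.0])

gamma = enhance_posteriors(post, trans, init)
print(gamma.round(3))
```

Each row of `gamma` is a posterior over states conditioned on the whole sequence and on the topology, rather than on a single frame's local context. With a uniform transition matrix and uniform initial distribution the topology carries no information, and the sketch returns the input posteriors unchanged.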
Keywords: Posterior Based ASR ; Artificial Neural Networks ; Local Posteriors ; Context ; Phonetic and Lexical Knowledge ; Enhanced Posteriors
Thesis, École polytechnique fédérale de Lausanne (EPFL), no. 4218 (2008)
Section de génie électrique et électronique
Faculté des sciences et techniques de l'ingénieur
Institut de génie électrique et électronique
Laboratoire de l'IDIAP
Record created on 2008-09-05, modified on 2016-08-08