Improving Grapheme-based ASR by Probabilistic Lexical Modeling Approach

There is growing interest in using graphemes as subword units, especially in the context of the rapid development of hidden Markov model (HMM) based automatic speech recognition (ASR) system, as it eliminates the need to build a phoneme pronunciation lexicon. However, directly modeling the relationship between acoustic feature observations and grapheme states may not be always trivial. It usually depends upon the grapheme-to-phoneme relationship within the language. This paper builds upon our recent interpretation of Kullback-Leibler divergence based HMM (KL-HMM) as a probabilistic lexical modeling approach to propose a novel grapheme-based ASR approach where, first a set of acoustic units are derived by modeling context-dependent graphemes in the framework of conventional HMM/Gaussian mixture model (HMM/GMM) system, and then the probabilistic relationship between the derived acoustic units and the lexical units representing graphemes is modeled in the framework of KL-HMM. Through experimental studies on English, where the grapheme-to-phoneme relationship is irregular, we show that the proposed grapheme-based ASR approach (without using any phoneme information) can achieve performance comparable to standard phoneme-based ASR approach.

Presented at:
Proceedings of Interspeech

 Record created 2013-12-19, last modified 2018-03-17

Download fulltext

Rate this document:

Rate this document:
(Not yet reviewed)