On Joint Modelling of Grapheme and Phoneme Information using KL-HMM for ASR

In this paper, we propose a simple approach to jointly model both grapheme and phoneme information using a Kullback-Leibler divergence based HMM (KL-HMM) system. More specifically, graphemes are used as subword units, and phoneme posterior probabilities estimated at the output of a multilayer perceptron are used as the observation feature vector. Preliminary studies on the DARPA Resource Management corpus show that, although the proposed approach yields lower performance than a KL-HMM system using phonemes as subword units, this performance gap can be bridged via temporal modelling at the observation feature vector level and contextual modelling of early tagged contextual graphemes.
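
To make the modelling idea concrete, the following is a minimal sketch of how a KL-HMM local score could be computed when states correspond to graphemes and observations are phoneme posterior vectors. It is an illustration under stated assumptions, not the system described in the paper: the three-class phoneme set, the state distributions, the posterior values, and the names `kl_divergence`, `local_scores`, `grapheme_states`, and `phoneme_posterior` are all hypothetical, and the direction of the divergence (state distribution relative to the observation) is one possible choice that the abstract does not specify.

```python
import math


def kl_divergence(p, q, eps=1e-10):
    """Return KL(p || q) for two discrete distributions of equal length.

    eps guards against log(0); terms with p_i == 0 contribute nothing.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def local_scores(states, posterior):
    """Score one frame against every grapheme state (lower is better)."""
    return {g: kl_divergence(y, posterior) for g, y in states.items()}


# Hypothetical toy setup: three phoneme classes, ordered as /k/, /s/, /ae/.
# Each grapheme state holds a categorical distribution over phoneme classes,
# which is how grapheme and phoneme information end up modelled jointly.
grapheme_states = {
    "c": [0.50, 0.45, 0.05],  # grapheme "c" may be realised as /k/ or /s/
    "a": [0.05, 0.05, 0.90],  # grapheme "a" is mostly realised as /ae/
}

# Hypothetical MLP output for a single frame (phoneme posteriors, sums to 1).
phoneme_posterior = [0.7, 0.2, 0.1]

scores = local_scores(grapheme_states, phoneme_posterior)
best = min(scores, key=scores.get)
print(scores, best)
```

In a full recogniser these per-frame scores would be accumulated along HMM state sequences by Viterbi decoding; the sketch covers only the frame-level comparison between a grapheme state and a phoneme posterior vector.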
