Infoscience

Report

Hierarchical Tandem Features for ASR in Mandarin

We apply multilayer perceptron (MLP) based hierarchical Tandem features to large vocabulary continuous speech recognition in Mandarin. Hierarchical Tandem features are estimated using a cascade of two MLP classifiers which are trained independently. The first classifier is trained on perceptual linear predictive coefficients with a 90 ms temporal context. The second classifier is trained using the phonetic class conditional probabilities estimated by the first MLP, but with a relatively longer temporal context of about 150 ms. Experiments on the Mandarin DARPA GALE eval06 dataset show significant reduction (about 7.6% relative) in character error rates by using hierarchical Tandem features over conventional Tandem features.

    Reference

    • EPFL-REPORT-155014

    Record created on 2010-11-17, modified on 2016-08-08

Related material