
Report

Spectral Entropy Feature in Multi-stream for Robust ASR

In recent papers, entropy computed from sub-bands of the spectrum was used as a feature for automatic speech recognition (ASR). In the present paper, we further study sub-band spectral entropy features, which capture the flatness or peakiness of each sub-band spectrum and, in turn, the position of the formants in the spectrum. The sub-band spectral entropy features are used in hybrid hidden Markov model/artificial neural network (HMM/ANN) systems and are found to be noise robust. The spectral entropy features are also investigated together with PLP features in multi-stream combination. Separate multi-layer perceptrons (MLPs) are trained on the PLP features, on the spectral entropy features, and on the concatenation of the two. The output posteriors of the three MLPs are combined after weighting, such that the weight given to a particular MLP's outputs is inversely proportional to the entropy of that MLP's output posterior distribution. In the Tandem framework, the combined output, after decorrelation, is fed to a standard hidden Markov model/Gaussian mixture model (HMM/GMM) system. Significant improvement in performance is reported when spectral entropy features are used along with PLP features in multi-stream combination.
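The two computations named in the abstract, sub-band spectral entropy and inverse-entropy weighting of MLP posteriors, can be illustrated with a minimal sketch. It is not the report's implementation: the function names, the equal-width non-overlapping bands, the per-frame weighting, the `eps` guard, and the weighted-sum combination rule are all assumptions made here for illustration, since the abstract does not specify them.

```python
import math


def subband_spectral_entropy(power_spectrum, num_bands):
    """Return one entropy value per sub-band of a single frame.

    Each band's bins are normalised to sum to 1 and treated as a
    probability mass function. A flat band gives high entropy and a
    peaky band (e.g. one containing a formant) gives low entropy.
    """
    n = len(power_spectrum)
    entropies = []
    for b in range(num_bands):
        # Assumption: equal-width, non-overlapping bands.
        lo = b * n // num_bands
        hi = (b + 1) * n // num_bands
        band = power_spectrum[lo:hi]
        total = sum(band)
        if total <= 0:
            # Convention of this sketch for an empty or silent band.
            entropies.append(0.0)
            continue
        probs = [x / total for x in band]
        entropies.append(-sum(p * math.log(p) for p in probs if p > 0))
    return entropies


def posterior_entropy(posteriors):
    """Shannon entropy (in nats) of one posterior distribution."""
    return -sum(p * math.log(p) for p in posteriors if p > 0)


def inverse_entropy_combine(streams, eps=1e-8):
    """Combine the posterior vectors that several MLPs emit for one frame.

    Each stream's weight is proportional to 1 / entropy of its
    posteriors, so a confident (low-entropy) classifier counts more.
    Assumption: weights are normalised to sum to 1 and the streams
    are merged by a weighted sum.
    """
    inverse = [1.0 / (posterior_entropy(p) + eps) for p in streams]
    norm = sum(inverse)
    weights = [w / norm for w in inverse]
    num_classes = len(streams[0])
    combined = [
        sum(w * p[k] for w, p in zip(weights, streams))
        for k in range(num_classes)
    ]
    return weights, combined


if __name__ == "__main__":
    # Toy frame: the first band is flat, the second has a single peak.
    print(subband_spectral_entropy([1, 1, 1, 1, 1, 0, 0, 0], num_bands=2))
    # Toy posteriors: a confident stream and an uninformative one.
    print(inverse_entropy_combine([[0.9, 0.05, 0.05], [1 / 3, 1 / 3, 1 / 3]]))
```

In this sketch the uniform posterior vector has the largest possible entropy and therefore receives the smaller weight, which matches the stated idea that less certain MLP outputs should contribute less to the combined stream.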
