Spectral Entropy Based Feature for Robust ASR

In general, entropy gives us a measure of the number of bits required to represent some information. When applied to probability mass function (PMF), entropy can also be used to measure the ``peakiness'' of a distribution. In this paper, we propose using the entropy of short time Fourier transform spectrum, normalised as PMF, as an additional feature for automatic speech recognition (ASR). It is indeed expected that a peaky spectrum, representation of clear formant structure in the case of voiced sounds, will have low entropy, while a flatter spectrum corresponding to non-speech or noisy regions will have higher entropy. Extending this reasoning further, we introduce the idea of multi-band/multi-resolution entropy feature where we divide the spectrum into equal size sub-bands and compute entropy in each sub-band. The results presented in this paper show that multi-band entropy features used in conjunction with normal cepstral features improve the performance of ASR system.

Martigny, Switzerland, IDIAP
in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing {(ICASSP)}, 2004

Note: The status of this file is: Anyone

 Record created 2006-03-10, last modified 2020-07-30

Download fulltextPDF
External link:
Download fulltextURL
Rate this document:

Rate this document:
(Not yet reviewed)