On Factorizing Spectral Dynamics for Robust Speech Recognition

In this paper, we introduce new dynamic speech features based on the modulation spectrum. These features, termed Mel-cepstrum Modulation Spectrum (MCMS), map the time trajectories of the spectral dynamics into a series of slow and fast moving orthogonal components, providing a more general and discriminative range of dynamic features than traditional delta and acceleration features. The features can be seen as the outputs of an array of band-pass filters spread over the cepstral modulation frequency range of interest. In experiments, it is shown that, as well as providing a slight improvement in clean conditions, these new dynamic features yield a significant increase in speech recognition performance in various noise conditions when compared directly to the standard temporal derivative features and RASTA-PLP features.


Year:
2003
Publisher:
IDIAP
Keywords:
Note:
in proceedings of Eurospeech 2003
Laboratories:




 Record created 2006-03-10, last modified 2018-03-17

n/a:
Download fulltextPDF
External link:
Download fulltextURL
Rate this document:

Rate this document:
1
2
3
 
(Not yet reviewed)