On Factorizing Spectral Dynamics for Robust Speech Recognition

Tyagi, Vivek; McCowan, Iain A.; Bourlard, Hervé; Misra, Hemant

doi:10.21437/Eurospeech.2003-338

conference paper

2003

8th European Conference on Speech Communication and Technology (Eurospeech 2003)

Eurospeech

In this paper, we introduce new dynamic speech features based on the modulation spectrum. These features, termed Mel-cepstrum Modulation Spectrum (MCMS), map the time trajectories of the spectral dynamics into a series of slow- and fast-moving orthogonal components, providing a more general and discriminative range of dynamic features than traditional delta and acceleration features. The features can be seen as the outputs of an array of band-pass filters spread over the cepstral modulation frequency range of interest. Experiments show that, in addition to a slight improvement in clean conditions, these new dynamic features yield a significant increase in speech recognition performance in various noise conditions when compared directly with standard temporal derivative features and RASTA-PLP features.
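The band-pass-filter view in the abstract can be illustrated with a short sketch. It is not the authors' implementation: it assumes that each modulation component is obtained as a DFT bin of a cepstral coefficient's trajectory over a sliding window of frames. The function name `mcms_features`, the window length, the number of retained bins, and the use of magnitudes are all hypothetical choices made for illustration.

```python
import numpy as np


def mcms_features(cepstra, window=11, num_bins=3):
    """Sketch of modulation-spectrum features from cepstral trajectories.

    cepstra  : array of shape (num_frames, num_coeffs), one cepstral vector per frame.
    window   : number of consecutive frames in each analysis window (assumed value).
    num_bins : number of non-DC modulation bins to keep (assumed value).

    Returns an array of shape (num_frames - window + 1, num_coeffs, num_bins)
    holding DFT magnitudes for modulation bins 1..num_bins.
    """
    cepstra = np.asarray(cepstra, dtype=float)
    if cepstra.ndim != 2:
        raise ValueError("cepstra must be 2-D: (num_frames, num_coeffs)")
    if cepstra.shape[0] < window:
        raise ValueError("need at least `window` frames")
    if not 1 <= num_bins <= window // 2:
        raise ValueError("num_bins must lie in 1..window // 2")

    # Sliding windows along time: shape (num_windows, num_coeffs, window).
    windows = np.lib.stride_tricks.sliding_window_view(cepstra, window, axis=0)

    # DFT along the time axis of each window. Each bin acts as one band-pass
    # filter on the trajectory; the DFT basis vectors are mutually orthogonal.
    spectrum = np.fft.rfft(windows, axis=-1)

    # Drop the DC bin and keep the slowest `num_bins` modulation components.
    return np.abs(spectrum[..., 1:num_bins + 1])


if __name__ == "__main__":
    # Toy input: coefficient 0 oscillates at modulation bin 2 of a 10-frame
    # window, coefficient 1 is constant over time.
    t = np.arange(30)
    toy = np.stack([np.cos(2 * np.pi * 2 * t / 10), np.ones(30)], axis=1)
    feats = mcms_features(toy, window=10, num_bins=3)
    print(feats.shape)
    print(np.round(feats[0], 3))
```

Under these assumptions, bin q of a window of P frames corresponds to a modulation frequency of q times the frame rate divided by P, so low bins capture slow-moving dynamics and higher bins capture faster ones. Delta and acceleration features, by comparison, each apply a single fixed filter to the same trajectory.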

Name: mcms_rr.pdf
Access type: open access
Size: 156.31 KB
Format: Adobe PDF
Checksum (MD5): 790be4e93ffa376557724ec50f7f85ee