Introducing Temporal Asymmetries in Feature Extraction for Automatic Speech Recognition
We propose a new auditory-inspired feature extraction technique for automatic speech recognition (ASR). Features are extracted by filtering the temporal trajectory of spectral energies in each critical band of speech with a bank of finite impulse response (FIR) filters. The impulse responses of these filters are derived from a modified Gabor envelope in order to emulate the asymmetries of the temporal receptive field (TRF) profiles observed in higher-level auditory neurons. We obtain an 11.4% relative improvement in word error rate on the OGI-Digits database and a 3.2% relative improvement in phoneme error rate on the TIMIT database over the MRASTA technique.
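The filtering step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the modified Gabor envelope, so everything specific here is an assumption made for illustration: the two-sided Gaussian envelope with different widths before and after its peak, the cosine carrier, the mean removal, the 100 Hz frame rate, and the function names `asymmetric_envelope`, `asymmetric_fir`, and `filter_trajectories`. It is not the authors' implementation.

```python
import numpy as np


def asymmetric_envelope(length, sigma_left, sigma_right):
    """Gaussian-shaped envelope whose width differs on each side of the peak.

    Assumed stand-in for the paper's modified Gabor envelope; widths are in frames.
    """
    t = np.arange(length) - length // 2
    sigma = np.where(t < 0, sigma_left, sigma_right)
    return np.exp(-0.5 * (t / sigma) ** 2)


def asymmetric_fir(length, sigma_left, sigma_right, mod_freq_hz, frame_rate_hz=100.0):
    """FIR impulse response: asymmetric envelope times a cosine carrier.

    The mean is removed so a constant trajectory gives zero output
    (an assumption of this sketch), and the result has unit L2 norm.
    """
    t = np.arange(length) - length // 2
    carrier = np.cos(2.0 * np.pi * mod_freq_hz * t / frame_rate_hz)
    h = asymmetric_envelope(length, sigma_left, sigma_right) * carrier
    h = h - h.mean()
    return h / np.linalg.norm(h)


def filter_trajectories(band_energies, filters):
    """Filter each critical-band energy trajectory with each FIR filter.

    band_energies: array of shape (n_bands, n_frames)
    filters: sequence of 1-D impulse responses
    Returns an array of shape (n_filters, n_bands, n_frames).
    """
    return np.stack([
        np.stack([np.convolve(traj, h, mode="same") for traj in band_energies])
        for h in filters
    ])
```

With, say, 15 critical bands and a bank of three filters at different modulation frequencies, `filter_trajectories` returns one filtered trajectory per band and filter; how these outputs are assembled into feature vectors for the recognizer is not specified in the abstract.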
- URL: http://publications.idiap.ch/downloads/papers/2008/sgarimel-is-2008.pdf
- Related documents: http://publications.idiap.ch/index.php/publications/showcite/sgarimel:rr08-25
Record created on 2010-02-11, modified on 2016-08-08