Spectro-Temporal Features for Automatic Speech Recognition using Linear Prediction in Spectral Domain

Thomas, Samuel; Ganapathy, Sriram; Hermansky, Hynek

2008

Abstract

Frequency Domain Linear Prediction (FDLP) provides an efficient way to represent the temporal envelopes of a signal using auto-regressive models. For the input speech signal, we use FDLP to estimate temporal trajectories of sub-band energy by applying linear prediction to the cosine transform of the sub-band signals. The sub-band FDLP envelopes are used to extract spectral and temporal features for speech recognition. The spectral features are derived by integrating the temporal envelopes over short-term frames, and the temporal features are formed by converting these envelopes into modulation frequency components. These features are then combined at the phoneme posterior level and used as the input features for a hybrid HMM-ANN based phoneme recognizer. The proposed spectro-temporal features provide a phoneme recognition accuracy of $69.1\%$ (an improvement of $4.8\%$ over the Perceptual Linear Prediction (PLP) baseline) on the TIMIT database.
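
The core idea in the abstract, linear prediction applied to the cosine transform of a signal, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `fdlp_envelope` and `short_term_energies`, the model order, the frame sizes, and the use of the full band instead of a sub-band decomposition are all assumptions made for illustration. The sketch fits an auto-regressive model to the DCT coefficients with the autocorrelation method, reads the resulting all-pole "spectrum" as a temporal envelope, and then integrates that envelope over short-term frames.

```python
import numpy as np
from scipy.fft import dct
from scipy.linalg import solve_toeplitz
from scipy.signal import freqz


def fdlp_envelope(signal, order=20):
    """Approximate the temporal envelope of `signal` with an all-pole model.

    Assumption: the whole band is modelled at once; the paper works on
    sub-bands, which would require windowing the DCT coefficients first.
    """
    x = np.asarray(signal, dtype=float)
    n = len(x)
    # Cosine transform of the signal (orthonormal DCT-II).
    c = dct(x, type=2, norm="ortho")
    # Autocorrelation of the DCT sequence for lags 0..order.
    r = np.array([np.dot(c[: n - k], c[k:]) for k in range(order + 1)])
    r[0] *= 1.0 + 1e-9  # tiny regularisation for numerical stability
    # Yule-Walker equations: Toeplitz(r[0..order-1]) a = r[1..order].
    a = solve_toeplitz(r[:order], r[1 : order + 1])
    gain = r[0] - np.dot(a, r[1 : order + 1])  # prediction error energy
    # Evaluate gain / |A(e^{jw})|^2 on n points in [0, pi); because the
    # model was fitted in the DCT domain, this axis corresponds to time.
    _, h = freqz([1.0], np.concatenate(([1.0], -a)), worN=n)
    return gain * np.abs(h) ** 2


def short_term_energies(envelope, frame_len, hop):
    """Integrate an envelope over short-term frames (hypothetical sizes)."""
    starts = range(0, len(envelope) - frame_len + 1, hop)
    return np.array([envelope[s : s + frame_len].sum() for s in starts])


if __name__ == "__main__":
    # Synthetic test signal: noise with an energy burst around sample 1400.
    rng = np.random.default_rng(0)
    t = np.arange(2000)
    x = rng.standard_normal(2000) * (0.1 + np.exp(-0.5 * ((t - 1400) / 50.0) ** 2))
    env = fdlp_envelope(x, order=20)
    print("envelope peak near sample", int(np.argmax(env)))
    print("frame energies:", short_term_energies(env, 200, 100).round(2))
```

In this sketch the model order controls how much temporal detail the envelope retains: a low order gives a smooth trajectory, while a higher order follows faster energy fluctuations.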

Details

Title Spectro-Temporal Features for Automatic Speech Recognition using Linear Prediction in Spectral Domain

Author(s) Thomas, Samuel ; Ganapathy, Sriram ; Hermansky, Hynek

Conference EUSIPCO 2008

Date 2008

Note IDIAP-RR 08-05

Laboratories LIDIAP

Record Appears in Scientific production and competences > STI - School of Engineering > IEM - Institut d'Electricité et de Microtechnique > LIDIAP - L'IDIAP Laboratory
Scientific production and competences > Euler Center for Signal Processing
Conference Papers
Work produced at EPFL
Published

Record creation date 2010-02-11
