Comparison and Combination of Features in a Hybrid HMM/MLP and a HMM/GMM Speech Recognition System

Pujol, Pere; Pol, Susagna; Nadeu, Climent; Hagen, Astrid; Bourlard, Hervé

2003

Abstract

Recently, the advantages of the spectral parameters obtained by frequency filtering (FF) of the logarithmic filter-bank energies (logFBEs) have been reported. These parameters, which are frequency derivatives of the logFBEs, lie in the frequency domain and have shown good recognition performance relative to the conventional MFCCs for HMM systems. In this paper, the FF features are first compared with the MFCCs and the Rasta-PLP features using both a hybrid HMM/MLP and a conventional HMM/GMM recognition system, for both clean and noisy speech. Taking advantage of the ability of the hybrid system to deal with correlated features, the inclusion of both the frequency second-derivatives and the raw logFBEs as additional features is proposed and tested. Moreover, the robustness of these features in noisy conditions is enhanced by combining the FF technique with the Rasta temporal filtering approach. Finally, a study of the FF features in the framework of multi-stream processing is presented. The best recognition results for both clean and noisy speech are obtained from the multi-stream combination of the J-Rasta-PLP features and the FF features.
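
As an illustration only, the idea of taking frequency derivatives of the logFBEs can be sketched as follows. The abstract does not give the filter used in the paper, so this sketch assumes a central difference across neighbouring bands for the first derivative, a three-point second difference for the second derivative, and zero padding at the band edges. The function names and the example frame are hypothetical.

```python
def frequency_filter(log_fbes):
    """First frequency derivative of one frame of logFBEs.

    Assumption: central difference x[k+1] - x[k-1] over the band index,
    with zeros assumed beyond the first and last band.
    """
    padded = [0.0] + list(log_fbes) + [0.0]
    return [padded[k + 2] - padded[k] for k in range(len(log_fbes))]


def second_frequency_derivative(log_fbes):
    """Second frequency derivative of one frame of logFBEs.

    Assumption: three-point difference x[k+1] - 2*x[k] + x[k-1],
    with the same zero padding at the edges.
    """
    padded = [0.0] + list(log_fbes) + [0.0]
    return [
        padded[k + 2] - 2.0 * padded[k + 1] + padded[k]
        for k in range(len(log_fbes))
    ]


def extended_feature_vector(log_fbes):
    """Concatenate FF features, second derivatives and raw logFBEs.

    This mirrors the extended feature set described in the abstract;
    the ordering of the three parts is an arbitrary choice for this sketch.
    """
    return (
        frequency_filter(log_fbes)
        + second_frequency_derivative(log_fbes)
        + list(log_fbes)
    )


# Hypothetical frame with four filter-bank bands (values are made up).
frame = [1.0, 2.0, 4.0, 7.0]
ff = frequency_filter(frame)
d2 = second_frequency_derivative(frame)
features = extended_feature_vector(frame)
```

The three parts of the concatenated vector are strongly correlated, since two of them are linear functions of the third; the abstract gives the hybrid system's ability to deal with correlated features as the reason for proposing this extension.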

Details

Title Comparison and Combination of Features in a Hybrid HMM/MLP and a HMM/GMM Speech Recognition System

Author(s) Pujol, Pere ; Pol, Susagna ; Nadeu, Climent ; Hagen, Astrid ; Bourlard, Hervé

Date 2003

Publisher IDIAP

Keywords

speech

Note IEEE Transactions on Speech and Audio Processing

Laboratories LIDIAP

Record Appears in Scientific production and competences > STI - School of Engineering > IEM - Institut d'Electricité et de Microtechnique > LIDIAP - L'IDIAP Laboratory
Scientific production and competences > Euler Center for Signal Processing
Work produced at EPFL
Technical Reports
Published

Record creation date 2006-03-10
