Application of Subspace Gaussian Mixture Models in Contrastive Acoustic Scenarios

Motlicek, Petr; Garner, Philip N.; Imseng, David; Valente, Fabio

2012

Abstract

This paper describes experimental results of applying Subspace Gaussian Mixture Models (SGMMs) in two very different acoustic scenarios: (a) a Large Vocabulary Continuous Speech Recognition (LVCSR) task over (well-resourced) English meeting data, and (b) acoustic modeling of under-resourced Afrikaans telephone data. In both cases, the performance of the SGMM models is compared with a conventional context-dependent HMM/GMM approach that exploits the same kind of information during training. The LVCSR systems are evaluated on the standard NIST Rich Transcription dataset. For under-resourced Afrikaans, the SGMM and HMM/GMM acoustic systems are additionally compared to KL-HMM and multilingual Tandem techniques boosted using supplemental out-of-domain data. Experimental results clearly show that the SGMM approach (having considerably fewer model parameters) outperforms the conventional HMM/GMM system in both scenarios and for all examined training conditions. In the under-resourced scenario, the SGMM trained using only in-domain data is superior to the other tested approaches boosted by data from other domains.

Details

Title Application of Subspace Gaussian Mixture Models in Contrastive Acoustic Scenarios

Author(s) Motlicek, Petr ; Garner, Philip N. ; Imseng, David ; Valente, Fabio

Date 2012

Publisher Rue Marconi 19, Martigny, Switzerland, Idiap

Keywords

acoustic modeling; Automatic Speech Recognition; Subspace Gaussian Mixture Models

Laboratories LIDIAP

Record Appears in Scientific production and competences > STI - School of Engineering > IEM - Institut d'Electricité et de Microtechnique > LIDIAP - L'IDIAP Laboratory
Scientific production and competences > Euler Center for Signal Processing
Work produced at EPFL
Technical Reports

Record creation date 2013-12-19
