Crosslingual Tandem-SGMM: Exploiting Out-Of-Language Data for Acoustic Model and Feature Level Adaptation

Motlicek, Petr; Imseng, David; Garner, Philip N.

Motlicek, Petr; Imseng, David; Garner, Philip N.

2013

Formats

Format
BibTeX
MARCXML
TextMARC
MARC
DublinCore
EndNote
NLM
RefWorks
RIS

Files

Abstract

Recent studies have shown that speech recognizers may benefit from data in languages other than the target language through efficient acoustic model- or feature-level adaptation. Crosslingual Tandem-Subspace Gaussian Mixture Models (SGMM) are successfully able to combine acoustic model- and feature-level adaptation techniques. More specifically, we focus on under-resourced languages (Afrikaans in our case) and perform feature-level adaptation through the estimation of phone class posterior features with a Multilayer Perceptron that was trained on data from a similar language with large amounts of available speech data (Dutch in our case). The same Dutch data can also be exploited on an acoustic model-level by training globally-shared SGMM parameters in a crosslingual way. The two adaptation techniques are indeed complementary and result in a crosslingual Tandem-SGMM system that yields relative improvement of about 22% compared to a standard speech recognizer on an Afrikaans phoneme recognition task. Interestingly, eventual score-level combination of the individual SGMM systems yields additional 3% relative improvement.

Details

Title Crosslingual Tandem-SGMM: Exploiting Out-Of-Language Data for Acoustic Model and Feature Level Adaptation

Author(s) Motlicek, Petr ; Imseng, David ; Garner, Philip N.

Date 2013

Publisher Rue Marconi 19, Martigny, Switzerland, Idiap

Keywords

Acoustic model adaptation; Automatic Speech Recognition; under-resourced languages

Laboratories LIDIAP

Record Appears in Scientific production and competences > STI - School of Engineering > IEM - Institut d'Electricité et de Microtechnique > LIDIAP - L'IDIAP Laboratory
Scientific production and competences > Euler Center for Signal Processing
Work produced at EPFL
Technical Reports

Record creation date 2013-12-19

Files

Abstract

Details

PDF