Template-based ASR using Posterior features and synthetic references: comparing different TTS systems

Soldo, Serena; Magimai.-Doss, Mathew; Bourlard, Hervé

2012

Abstract

In recent work, using phone class-conditional posterior probabilities (posterior features) directly as features has given successful results in template-based ASR systems. In this paper, motivated by the high quality of current text-to-speech systems and the robustness of posterior features to undesired variability, we investigate the use of synthetic speech to generate reference templates. Using synthetic speech in template-based ASR not only addresses the issue of in-domain data collection but also allows the vocabulary to be expanded. On 75- and 600-word task-independent and speaker-independent setups of the Phonebook corpus, we show the feasibility of this approach by investigating different synthetic voices produced by an HTS-based synthesizer trained on two different databases. Our study shows that synthetic speech templates can yield performance comparable to that of natural speech templates, especially with synthetic voices that have high intelligibility.
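
The template-matching idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the alignment method or the local distance, so dynamic time warping (DTW) and a symmetric Kullback-Leibler divergence are assumptions made here for illustration, and the function names `sym_kl`, `dtw_cost`, and `recognize` as well as the toy two-class posteriors are hypothetical. In a setup like the one described, the reference templates would be posterior sequences estimated from synthetic (TTS) speech.

```python
import math


def sym_kl(p, q, eps=1e-10):
    """Symmetric KL divergence between two posterior vectors.

    Assumed local distance; equals KL(p||q) + KL(q||p).
    """
    total = 0.0
    for pi, qi in zip(p, q):
        pi = max(pi, eps)  # floor to avoid log(0)
        qi = max(qi, eps)
        total += (pi - qi) * math.log(pi / qi)
    return total


def dtw_cost(template, test):
    """Length-normalized DTW cost between two sequences of posterior vectors."""
    n, m = len(template), len(test)
    inf = float("inf")
    # acc[i][j] = best accumulated cost aligning template[:i] with test[:j]
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = sym_kl(template[i - 1], test[j - 1])
            acc[i][j] = d + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m] / (n + m)


def recognize(templates, test):
    """Return the word whose reference template has the lowest DTW cost."""
    return min(templates, key=lambda word: dtw_cost(templates[word], test))


# Toy example: two "words", each a short sequence of 2-class posteriors.
# In practice the templates would come from synthetic speech.
templates = {
    "yes": [[0.9, 0.1], [0.1, 0.9]],
    "no": [[0.1, 0.9], [0.9, 0.1]],
}
test_utterance = [[0.8, 0.2], [0.8, 0.2], [0.2, 0.8]]
best_word = recognize(templates, test_utterance)
```

Under these assumptions, adding a word to the vocabulary only requires synthesizing it and storing its posterior sequence as a new entry in `templates`.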

Details

Title Template-based ASR using Posterior features and synthetic references: comparing different TTS systems

Author(s) Soldo, Serena ; Magimai.-Doss, Mathew ; Bourlard, Hervé

Published in SAPA-SCALE conference (SAPA 2012)

Pages 52-57

Conference SAPA-SCALE Conference, International Speech Communication Association

Date 2012

Keywords

Posterior features; speech recognition; synthetic reference templates; template-based approach

Laboratories LIDIAP


Record creation date 2013-12-19
