VTLN Adaptation for Statistical Speech Synthesis

Saheer, Lakshmi; Garner, Philip N.; Dines, John; Liang, Hui

doi:10.1109/ICASSP.2010.5495126

2010

Abstract

The advent of statistical speech synthesis has enabled the unification of the basic techniques used in speech synthesis and recognition. Adaptation techniques that have been used successfully in recognition systems can now be applied to synthesis systems to improve the quality of the synthesized speech. This paper explores the application of vocal tract length normalization (VTLN) to synthesis. VTLN-based adaptation requires the estimation of only a single warping factor, which can be estimated accurately from very little adaptation data and gives additive improvements over CMLLR adaptation. The challenge of estimating accurate warping factors from higher-order features is solved by initializing the warping factor estimation with values calculated from lower-order features.
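
As a rough illustration of the idea of a single warping factor, the following toy sketch warps normalized frequencies with an all-pass bilinear transform and estimates the factor in two stages. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the least-squares cost on a handful of spectral peak frequencies (standing in for a likelihood computed on cepstral features), and the use of a subset of the peaks as a stand-in for lower-order features that initialize the search over the full set.

```python
import math

def bilinear_warp(omega, alpha):
    """Warp a normalized frequency omega (radians, 0..pi) with an all-pass
    bilinear transform controlled by a single factor alpha, |alpha| < 1."""
    return omega + 2.0 * math.atan2(alpha * math.sin(omega),
                                    1.0 - alpha * math.cos(omega))

def cost(source, target, alpha):
    """Toy objective (assumption): squared error between warped source
    frequencies and target frequencies."""
    return sum((bilinear_warp(s, alpha) - t) ** 2
               for s, t in zip(source, target))

def search(source, target, center, half_width, steps):
    """Grid search for alpha on [center - half_width, center + half_width]."""
    grid = [center + half_width * (2.0 * i / (steps - 1) - 1.0)
            for i in range(steps)]
    grid = [a for a in grid if abs(a) < 1.0]  # keep the warp well defined
    return min(grid, key=lambda a: cost(source, target, a))

def two_stage_estimate(source, target, n_coarse=2):
    """Stage 1: coarse estimate from the first n_coarse frequencies only.
    Stage 2: fine search around that estimate using all frequencies."""
    coarse = search(source[:n_coarse], target[:n_coarse], 0.0, 0.2, 9)
    return search(source, target, coarse, 0.05, 11)

if __name__ == "__main__":
    # Hypothetical data: target peaks are the source peaks warped by 0.07.
    source_peaks = [0.3, 0.8, 1.5, 2.2, 2.9]
    target_peaks = [bilinear_warp(w, 0.07) for w in source_peaks]
    print(two_stage_estimate(source_peaks, target_peaks))
```

The second stage only searches a narrow interval around the first estimate, which mirrors, in a very simplified way, the abstract's point that a value obtained from an easier estimation problem can seed a harder one.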

Details

Title VTLN Adaptation for Statistical Speech Synthesis

Author(s) Saheer, Lakshmi ; Garner, Philip N. ; Dines, John ; Liang, Hui

Published in Proceedings of ICASSP

Pages 4838-4841

Date 2010

DOI https://doi.org/10.1109/ICASSP.2010.5495126

Laboratories LIDIAP

Record creation date 2010-02-11
