Combining vocal tract length normalization with hierarchial linear transformations

Saheer, Lakshmi; Yamagishi, Junichi; Garner, Philip N.; Dines, John

doi:10.1109/ICASSP.2012.6287948

2012

Abstract

Recent research has demonstrated the effectiveness of vocal tract length normalization (VTLN) as a rapid adaptation technique for statistical parametric speech synthesis. VTLN produces speech with naturalness preferable to that of MLLR-based adaptation techniques, being much closer in quality to that generated by the original average voice model. However, with only a single parameter, VTLN captures very few speaker-specific characteristics when compared to linear-transform-based adaptation techniques. This paper proposes that the merits of VTLN can be combined with those of linear-transform-based adaptation in a hierarchical Bayesian framework, where VTLN is used as the prior information. A novel technique for propagating the gender information from the VTLN prior through constrained structural maximum a posteriori linear regression (CSMAPLR) adaptation is presented. Experiments show that the resulting transformation has improved speech quality with better naturalness, intelligibility and improved speaker similarity.
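
As a rough illustration of why a single parameter is so constraining, the sketch below assumes the first-order all-pass (bilinear) frequency warp that is commonly used for VTLN; the abstract itself does not specify the warping function. The names `bilinear_warp`, `map_shrink`, `alpha` and `tau` are hypothetical. `map_shrink` is a generic conjugate-prior interpolation that stands in for the idea of a prior-regularised transform; it is not the CSMAPLR estimation described in the paper.

```python
import numpy as np


def bilinear_warp(omega, alpha):
    """Warp angular frequencies in [0, pi] with a first-order all-pass.

    Assumption: this is the common bilinear VTLN warp, controlled by the
    single parameter alpha (|alpha| < 1). alpha = 0 leaves the axis unchanged.
    """
    omega = np.asarray(omega, dtype=float)
    return omega + 2.0 * np.arctan2(alpha * np.sin(omega),
                                    1.0 - alpha * np.cos(omega))


def map_shrink(ml_estimate, prior, n_frames, tau):
    """Toy MAP-style interpolation between a data-driven estimate and a prior.

    Assumption: a simplified stand-in, not CSMAPLR. With little data the
    result stays near the prior; with much data it approaches the estimate.
    tau (hypothetical) controls how strongly the prior is trusted.
    """
    weight = n_frames / (n_frames + tau)
    return weight * np.asarray(ml_estimate) + (1.0 - weight) * np.asarray(prior)


if __name__ == "__main__":
    grid = np.linspace(0.0, np.pi, 5)
    # One scalar moves the whole frequency axis; the endpoints stay fixed.
    print(bilinear_warp(grid, 0.1))
    # With few frames the "adapted" transform remains close to the prior.
    print(map_shrink(2.0 * np.eye(2), np.eye(2), n_frames=10, tau=10.0))
```

In this toy setting the warp can only stretch or compress the spectrum globally, which is consistent with the abstract's point that VTLN alone captures few speaker-specific characteristics, while the interpolation shows how a prior can dominate when adaptation data are scarce.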

Details

Title Combining vocal tract length normalization with hierarchial linear transformations

Author(s) Saheer, Lakshmi ; Yamagishi, Junichi ; Garner, Philip N. ; Dines, John

Published in 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pages 4493-4496

Presented at International conference on Speech and Signal processing, Kyoto, Japan

Date 2012

Publisher IEEE

Keywords (free)

constrained structural maximum a posteriori linear regression; hidden Markov models; speaker adaptation; Statistical parametric speech synthesis; vocal tract length normalization

DOI https://doi.org/10.1109/ICASSP.2012.6287948

Laboratories LIDIAP

Record creation date 2013-12-19
