DNN-based Speech Synthesis: Importance of input features and training data

Deep neural networks (DNNs) have recently been introduced in speech synthesis. This paper investigates the importance of input features and training data for speaker-dependent (SD) DNN-based speech synthesis. Various aspects of the DNN training procedure are examined, and several training sets of different sizes (13.5, 3.6 and 1.5 h of speech) are evaluated.


Editor(s): Ronzhin, A.; Potapova, R.; Fakotakis, N.
Published in: Speech and Computer
Year: 2015
Publisher: Springer Berlin Heidelberg
ISBN: 978-3-319-23131-0
 Record created 2015-06-19, last modified 2018-03-17

