Improving Children Speech Recognition through Feature Learning from Raw Speech Signal

Dubagunta, S. Pavankumar; Kabil, Selen Hande; Magimai.-Doss, Mathew

doi:10.1109/ICASSP.2019.8682826

2019

Abstract

Children's speech recognition based on short-term spectral features is a challenging task. One reason is that children's speech has a high fundamental frequency, comparable to formant frequency values. Furthermore, as children grow, their vocal apparatus changes. Both factors make it difficult to reliably extract standard short-term spectral features for speech recognition. In recent years, novel acoustic modeling methods have emerged that learn both the features and the phone classifier in an end-to-end manner from the raw speech signal. Through an investigation on the PF-STAR corpus, we show that children's speech recognition can be improved using end-to-end acoustic modeling methods.
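
The point about high fundamental frequency can be illustrated with a small sketch. In voiced speech, energy sits at the harmonics of the fundamental frequency (F0), so the spectral envelope that carries the formants is only observed at those harmonics. The F0 and formant values below are illustrative assumptions, not figures from the paper, and the helper names are hypothetical.

```python
def harmonics(f0_hz, max_hz):
    """Frequencies of the harmonics of f0_hz up to and including max_hz."""
    return [k * f0_hz for k in range(1, int(max_hz // f0_hz) + 1)]


def nearest_harmonic_gap(f0_hz, formant_hz, max_hz=4000.0):
    """Distance in Hz from a formant peak to the closest harmonic."""
    return min(abs(h - formant_hz) for h in harmonics(f0_hz, max_hz))


# Illustrative values only (assumptions, not taken from the paper).
ADULT_F0 = 120.0   # assumed adult fundamental frequency in Hz
CHILD_F0 = 350.0   # assumed child fundamental frequency in Hz
FORMANT = 500.0    # hypothetical first-formant peak in Hz

for label, f0 in (("adult", ADULT_F0), ("child", CHILD_F0)):
    count = len(harmonics(f0, 4000.0))
    gap = nearest_harmonic_gap(f0, FORMANT)
    print(f"{label}: F0={f0:.0f} Hz, {count} harmonics up to 4 kHz, "
          f"closest harmonic to {FORMANT:.0f} Hz is {gap:.0f} Hz away")
```

Under these assumed values, the higher F0 leaves far fewer harmonics to sample the envelope, and the closest harmonic can lie well away from the formant peak, which is one way to picture why envelope-based short-term spectral features become less reliable.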

Details

Title Improving Children Speech Recognition through Feature Learning from Raw Speech Signal

Author(s) Dubagunta, S. Pavankumar ; Kabil, Selen Hande ; Magimai.-Doss, Mathew

Published in 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pages 5736-5740

Conference Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), May 12-17, 2019, Brighton, England

Date 2019

Publisher New York, IEEE

Keywords

acoustic modeling; Children speech recognition; Convolutional Neural Networks; end-to-end training

DOI https://doi.org/10.1109/ICASSP.2019.8682826

Laboratories LIDIAP

Record Appears in Scientific production and competences > STI - School of Engineering > IEM - Institut d'Electricité et de Microtechnique > LIDIAP - L'IDIAP Laboratory
Scientific production and competences > Euler Center for Signal Processing
Conference Papers
Work produced at EPFL
Published

Record creation date 2019-02-25
