Segment-level training of ANNs based on acoustic confidence measures for hybrid HMM/ANN Speech Recognition

Dubagunta, S. Pavankumar; Magimai.-Doss, Mathew

doi:10.1109/ICASSP.2019.8683513

conference paper

2019

2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

We show that confidence measures estimated from local posterior probabilities can serve as objective functions for training ANNs in hybrid HMM-based speech recognition systems. This leads to a segment-level training paradigm that overcomes a limitation of frame-level updates, namely that they ignore the sequence structure in speech. We propose measures that train at the state and phone segment levels, while still decoding in the conventional framework. Experimental results on multiple corpora show that such training not only yields better-performing systems, but also gives additional improvements when combined with sequence discriminative training. These techniques generalise across front-ends and model architectures, and efficiently handle the effect of segment duration variations on the ANN training.
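The abstract does not state the proposed measures, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes one simple, duration-normalised confidence score (the mean log posterior of the aligned class within each segment) and contrasts it with the conventional frame-level cross-entropy. The function names, the segmentation rule, and the choice of score are all hypothetical.

```python
import math
from itertools import groupby


def frame_level_loss(posteriors, alignment):
    """Conventional frame-level cross-entropy: every frame counts equally."""
    scores = [math.log(p[s]) for p, s in zip(posteriors, alignment)]
    return -sum(scores) / len(scores)


def segment_level_loss(posteriors, alignment):
    """Hypothetical segment-level objective (an assumption, not the paper's).

    A segment is a maximal run of consecutive frames sharing one aligned
    label. Each segment is scored by the mean log posterior of its label,
    so long and short segments contribute equally to the loss.
    Simplification: two adjacent segments with the same label are merged.
    """
    scores = [math.log(p[s]) for p, s in zip(posteriors, alignment)]
    segment_scores = []
    start = 0
    for _, run in groupby(alignment):
        length = len(list(run))
        segment_scores.append(sum(scores[start:start + length]) / length)
        start += length
    return -sum(segment_scores) / len(segment_scores)


if __name__ == "__main__":
    # Toy input: three confident frames of class 0, one uncertain frame of class 1.
    posteriors = [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1], [0.5, 0.5]]
    alignment = [0, 0, 0, 1]
    print("frame-level loss:  ", frame_level_loss(posteriors, alignment))
    print("segment-level loss:", segment_level_loss(posteriors, alignment))
```

In this toy input the short, uncertain segment carries half of the segment-level loss but only a quarter of the frame-level loss, which is one way segment duration can influence training.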

Type

conference paper

DOI

10.1109/ICASSP.2019.8683513

Web of Science ID

WOS:000482554006133

Authors

Dubagunta, S. Pavankumar; Magimai.-Doss, Mathew

Publication date

2019

Publisher

IEEE

Published in

2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Publisher place

New York

Start page

6435

End page

6439

Subjects

confidence measures

local posterior proba...

segment-level trainin...

speech recognition

Event name	Event place	Event date
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)	Brighton, England	May 12-17, 2019