Stream fusion for multi-stream automatic speech recognition

Sagha, Hesam; Li, Feipeng; Variani, Ehsan; Millan, Jose Del R.; Chavarriaga, Ricardo; Schuller, Bjoern

doi:10.1007/s10772-016-9357-1

research article

2016

International Journal Of Speech Technology

Multi-stream automatic speech recognition (MS-ASR) has been shown to improve recognition performance in noisy conditions. In such a system, stream generation and stream fusion are the essential parts, and both need to be designed to reduce the effect of noise on the final decision. This paper shows how to improve the performance of MS-ASR by addressing two questions: (1) how many streams should be combined, and (2) how to combine them. First, we propose a novel approach based on stream reliability to select the number of streams to be fused. Second, a fusion method based on Parallel Hidden Markov Models is introduced. Applying the method to two datasets (TIMIT and RATS) under different noise conditions, we show an improvement in MS-ASR performance.
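The abstract does not spell out the selection rule or the fusion model, so the following is only a rough sketch of the general idea of reliability-driven stream selection followed by fusion. Everything in it is an assumption made for illustration: the function names `select_streams` and `fuse_posteriors`, the `rel_threshold` parameter, the rule of keeping streams whose reliability is within a fraction of the best one, and the use of a reliability-weighted average of frame posteriors in place of the Parallel Hidden Markov Model fusion described in the paper. Reliability scores are assumed to be positive.

```python
import numpy as np


def select_streams(reliability, rel_threshold=0.8):
    """Pick the streams to fuse (hypothetical rule, not the paper's).

    Keeps every stream whose reliability is at least `rel_threshold`
    times the reliability of the best stream, so the number of fused
    streams adapts to how the scores are spread.
    """
    reliability = np.asarray(reliability, dtype=float)
    return np.flatnonzero(reliability >= rel_threshold * reliability.max())


def fuse_posteriors(posteriors, reliability, keep):
    """Reliability-weighted average of per-frame class posteriors.

    posteriors: array of shape (n_streams, n_frames, n_classes)
    reliability: array of shape (n_streams,), assumed positive
    keep: indices of the selected streams
    This is a simple stand-in for the Parallel HMM fusion in the paper.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    weights = np.asarray(reliability, dtype=float)[keep]
    weights = weights / weights.sum()
    # Contract the stream axis: result has shape (n_frames, n_classes).
    fused = np.tensordot(weights, posteriors[keep], axes=1)
    # Renormalise each frame so the fused scores sum to one.
    return fused / fused.sum(axis=-1, keepdims=True)


if __name__ == "__main__":
    # Three streams, one frame, two classes (made-up numbers).
    reliability = [0.9, 0.5, 0.8]
    posteriors = [[[0.8, 0.2]], [[0.1, 0.9]], [[0.6, 0.4]]]
    keep = select_streams(reliability)
    fused = fuse_posteriors(posteriors, reliability, keep)
    print("selected streams:", keep.tolist())
    print("fused posteriors:", fused)
```

In this toy setting the low-reliability stream is dropped before fusion, which is the effect the first question in the abstract is concerned with; how reliability is actually estimated and how the streams are combined in the paper are not reproduced here.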

Type: research article

DOI: 10.1007/s10772-016-9357-1

Web of Science ID: WOS:000387583900002

Authors: Sagha, Hesam; Li, Feipeng; Variani, Ehsan; Millan, Jose Del R.; Chavarriaga, Ricardo; Schuller, Bjoern

Publication date: 2016

Publisher: Springer Verlag

Published in: International Journal Of Speech Technology

Volume: 19

Issue: 4

Start page: 669

End page: 675

Subjects: Multi-stream speech r...; Performance monitor; Classifier ensemble c...

Peer reviewed: REVIEWED

EPFL units: CNBI

Available on Infoscience: January 24, 2017

Use this identifier to reference this record: https://infoscience.epfl.ch/handle/20.500.14299/133599