Dynamic modality weighting for multi-stream HMMs in Audio-Visual Speech Recognition

Gurban, Mihai; Thiran, Jean-Philippe; Drugman, Thomas; Dutoit, Thierry

doi:10.1145/1452392.1452442

2008

Abstract

Merging decisions from different modalities is a crucial problem in Audio-Visual Speech Recognition. To address it, state-synchronous multi-stream HMMs have been proposed, with the important advantage that they incorporate stream reliability into the fusion scheme. This paper focuses on stream weight adaptation based on modality confidence estimators. We assume different and time-varying environment noise, as encountered in realistic applications, for which adaptive methods are best suited. Stream reliability is assessed directly from classifier outputs, since they are not specific to either noise type or level. The influence of constraining the weights to sum to one is also discussed.
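
As a rough illustration of the fusion scheme the abstract refers to: in a state-synchronous multi-stream HMM, the per-state emission score is commonly written as a weighted sum of the per-stream log-likelihoods, log b_j(o_t) = w_A log b_jA(o_t) + w_V log b_jV(o_t). The sketch below recomputes the weights at every frame from classifier posteriors. The entropy-based confidence mapping and all function names are assumptions made for illustration; they are not taken from the paper and may differ from the estimators it evaluates.

```python
import math


def entropy(posteriors):
    """Shannon entropy (in nats) of a discrete posterior distribution."""
    return -sum(p * math.log(p) for p in posteriors if p > 0.0)


def stream_weights(audio_post, video_post):
    """Return (w_audio, w_video) for one frame, constrained to sum to one.

    Assumed mapping (hypothetical, not the paper's): each stream's
    confidence is 1 - H / H_max, so a peaked posterior gives a confidence
    near 1 and a flat one gives 0. Each posterior needs at least 2 classes.
    """
    conf_a = max(0.0, 1.0 - entropy(audio_post) / math.log(len(audio_post)))
    conf_v = max(0.0, 1.0 - entropy(video_post) / math.log(len(video_post)))
    total = conf_a + conf_v
    if total <= 0.0:
        # Neither stream is informative: fall back to equal weights.
        return 0.5, 0.5
    w_audio = conf_a / total
    return w_audio, 1.0 - w_audio


def combined_log_likelihood(log_b_audio, log_b_video, w_audio, w_video):
    """Weighted sum of per-stream log-likelihoods for one HMM state."""
    return w_audio * log_b_audio + w_video * log_b_video
```

With a sharply peaked audio posterior and a flat video posterior, this mapping shifts nearly all of the weight to the audio stream; when both posteriors are flat it falls back to equal weights. Dropping the normalization by `total` would give the unconstrained variant, in which the two weights need not sum to one.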

Details

Title Dynamic modality weighting for multi-stream HMMs in Audio-Visual Speech Recognition

Author(s) Gurban, Mihai ; Thiran, Jean-Philippe ; Drugman, Thomas ; Dutoit, Thierry

Published in Proceedings of the 10th International Conference on Multimodal Interfaces

Pages 237-240

Conference 10th International Conference on Multimodal Interfaces, Chania, Greece, October 20-22, 2008

Date 2008

Publisher New York, NY, USA, ACM

Keywords

LTS5

DOI https://doi.org/10.1145/1452392.1452442

Laboratories LTS5

Record Appears in Scientific production and competences > STI - School of Engineering > IEM - Institut d'Electricité et de Microtechnique > LTS5 - Signal Processing Laboratory 5
Peer-reviewed publications
Conference Papers
Work produced at EPFL
Published

Record creation date 2008-06-09
