Using entropy as a stream reliability estimate for audio-visual speech recognition

Gurban, Mihai; Thiran, Jean-Philippe

conference paper

2008

16th European Signal Processing Conference

We present a method for dynamically integrating audio-visual information for speech recognition, based on the estimated reliability of the audio and visual streams. Our method uses an information-theoretic measure, the entropy derived from the state probability distribution for each stream, as an estimate of reliability. The two modalities, audio and video, are weighted at each time instant according to their reliability. In this way, the weights vary dynamically and can adapt to any type of noise in each modality and, more importantly, to unexpected variations in the noise level.
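As a rough illustration of the idea in the abstract, the sketch below computes the entropy of each stream's state probability distribution for one time frame and turns the two entropies into stream weights. The abstract does not give the exact mapping from entropy to weight, so the rule used here (each stream's weight is proportional to the other stream's entropy) is an assumption made for illustration only, as are the function names `entropy`, `stream_weights`, and `combined_log_likelihood` and the weighted sum of log-likelihoods used to combine the streams.

```python
import math


def entropy(posterior):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in posterior if p > 0.0)


def stream_weights(audio_posterior, video_posterior):
    """Return (w_audio, w_video) for one time frame, summing to 1.

    Assumption (not taken from the paper): each stream's weight is
    proportional to the OTHER stream's entropy, so the stream with the
    more peaked (lower-entropy) state distribution gets the larger weight.
    """
    h_audio = entropy(audio_posterior)
    h_video = entropy(video_posterior)
    total = h_audio + h_video
    if total == 0.0:
        # Both streams are fully confident: fall back to equal weights.
        return 0.5, 0.5
    return h_video / total, h_audio / total


def combined_log_likelihood(log_b_audio, log_b_video, w_audio, w_video):
    """Weighted sum of per-stream log-likelihoods (hypothetical combination rule)."""
    return w_audio * log_b_audio + w_video * log_b_video


# Example frame: audio is confident about one state, video is uninformative.
audio_post = [0.90, 0.05, 0.05]
video_post = [1 / 3, 1 / 3, 1 / 3]
w_a, w_v = stream_weights(audio_post, video_post)
print(f"audio weight = {w_a:.3f}, video weight = {w_v:.3f}")
```

Because the weights are recomputed for every frame, a sudden burst of noise that flattens one stream's state distribution would immediately shift weight to the other stream, which is the dynamic behaviour the abstract describes.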

Name

eusipco08.pdf

Access type

openaccess

Size

87.49 KB

Format

Adobe PDF

Checksum (MD5)

e22684cf4b510a09515279ba6e9445d0