Using entropy as a stream reliability estimate for audio-visual speech recognition

We present a method for dynamically integrating audio-visual information for speech recognition, based on the estimated reliability of the audio and visual streams. Our method uses an information-theoretic measure, the entropy of the state probability distribution of each stream, as an estimate of that stream's reliability. The two modalities, audio and video, are weighted at each time instant according to their reliability. The weights therefore vary dynamically and can adapt to any type of noise in each modality and, more importantly, to unexpected variations in the noise level.
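The weighting idea can be illustrated with a minimal sketch. The abstract does not give the mapping from entropy to weights, so the inverse-entropy rule used here, the log-linear fusion of per-state log-likelihoods, and all function names (`entropy`, `stream_weights`, `fuse_log_likelihoods`) are assumptions made for illustration, not the authors' formulation.

```python
import math

def entropy(posteriors):
    """Shannon entropy (in nats) of a discrete distribution over states."""
    return -sum(p * math.log(p) for p in posteriors if p > 0.0)

def stream_weights(audio_post, video_post, eps=1e-12):
    """Return (w_audio, w_video) for one time frame.

    Assumed rule: each stream's weight is proportional to the entropy of
    the OTHER stream, so the more peaked (lower-entropy) stream is
    trusted more. The weights sum to 1.
    """
    h_a = entropy(audio_post)
    h_v = entropy(video_post)
    total = h_a + h_v
    if total < eps:
        # Both streams are fully confident: fall back to equal weights.
        return 0.5, 0.5
    w_a = h_v / total
    return w_a, 1.0 - w_a

def fuse_log_likelihoods(audio_loglik, video_loglik, w_a, w_v):
    """Weighted sum of per-state log-likelihoods (assumed log-linear fusion)."""
    return [w_a * la + w_v * lv for la, lv in zip(audio_loglik, video_loglik)]

# Hypothetical frame: the audio posterior is flat (noisy audio), while the
# video posterior is sharply peaked on the first state.
audio_post = [0.25, 0.25, 0.25, 0.25]
video_post = [0.97, 0.01, 0.01, 0.01]
w_a, w_v = stream_weights(audio_post, video_post)
fused = fuse_log_likelihoods(
    [math.log(p) for p in audio_post],
    [math.log(p) for p in video_post],
    w_a,
    w_v,
)
```

In this hypothetical frame the flat audio posterior has the higher entropy, so the video stream receives the larger weight. Recomputing the weights at every frame is what lets them follow changes in the noise level.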


Published in: 16th European Signal Processing Conference
Presented at: 16th European Signal Processing Conference, Lausanne, Switzerland, August 25-29, 2008
Year: 2008
Publisher: Lausanne, Switzerland
 Record created 2008-06-09, last modified 2018-01-28
