report
Audio-visual reliability estimates using stream entropy for speech recognition
2009
We present a method for multimodal fusion based on the estimated reliability of each individual modality. Our method uses an information-theoretic measure, the entropy derived from the state probability distribution for each stream, as an estimate of reliability. Our application is audio-visual speech recognition. The two modalities, audio and video, are weighted at each time instant according to their reliability. As a result, the weights vary dynamically and can adapt to any type of noise in each modality and, more importantly, to unexpected variations in the noise level.
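The idea of entropy-based stream weighting can be illustrated with a minimal sketch. The abstract does not give the mapping from entropy to weights, so the one used here (reliability taken as one minus the normalized entropy, then rescaled so the two weights sum to one) is an assumption chosen for illustration and not necessarily the report's formula. The function names `entropy`, `stream_weights`, and `fuse_log_likelihoods` are likewise hypothetical.

```python
import math


def entropy(posterior):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in posterior if p > 0.0)


def reliability(posterior):
    """Assumed reliability score in [0, 1]: one minus the normalized entropy.

    A peaked state distribution (low entropy) scores near 1;
    a flat one (high entropy) scores near 0.
    """
    h_max = math.log(len(posterior))
    if h_max == 0.0:
        return 1.0  # a single state carries no uncertainty
    return max(0.0, 1.0 - entropy(posterior) / h_max)


def stream_weights(audio_posterior, video_posterior):
    """Per-frame weights (w_audio, w_video) that sum to 1.

    Assumption: weights are proportional to the reliability scores above.
    """
    r_a = reliability(audio_posterior)
    r_v = reliability(video_posterior)
    total = r_a + r_v
    if total <= 1e-12:
        return 0.5, 0.5  # both streams uninformative: fall back to equal weights
    return r_a / total, r_v / total


def fuse_log_likelihoods(audio_loglik, video_loglik, w_a, w_v):
    """Weighted sum of per-state log-likelihoods from the two streams."""
    return [w_a * a + w_v * v for a, v in zip(audio_loglik, video_loglik)]


# One time frame: the audio stream is confident, the video stream is not.
audio_post = [0.90, 0.05, 0.05]
video_post = [1 / 3, 1 / 3, 1 / 3]
w_a, w_v = stream_weights(audio_post, video_post)
```

Recomputing the weights at every frame is what lets them follow changes in the noise level: if the audio posterior flattens because of a sudden burst of noise, its entropy rises and its weight drops for that frame.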
Name
dyn.pdf
Access type
openaccess
Size
1.04 MB
Format
Adobe PDF
Checksum (MD5)
49921f58fff394bae1a270d530b5707b