Learning Multi-Modal Dictionaries: Application to Audiovisual Data

This paper presents a methodology for extracting meaningful synchronous structures from multi-modal signals. Processing multi-modal data jointly can reveal information that is unavailable when the sources are handled separately. In natural high-dimensional data, however, the statistical dependencies between modalities are rarely obvious. Learning fundamental multi-modal patterns offers an alternative to classical statistical methods. Since recurrent patterns are typically shift invariant, the learning procedure should search for the best-matching filters at every position. We present a new algorithm that iteratively learns multi-modal generating functions that can be shifted to any position in the signal. The proposed algorithm is applied to audiovisual sequences and is shown to discover underlying structures in the data.
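The key computational ingredient named in the abstract is shift invariance: a learned generating function (filter) must be scored at every position in the signal to find its best placement. Below is a minimal illustrative sketch of that matching step in Python with NumPy; the function name, toy atom, and test data are ours for illustration and are not taken from the paper, which pairs this step with an iterative multi-modal learning procedure.

```python
import numpy as np

def best_shift(signal, atom):
    """Return (position, score) of the best placement of `atom`
    over `signal`, scanning all valid shifts (shift invariance)."""
    atom = atom / np.linalg.norm(atom)          # unit-norm filter
    # 'valid'-mode correlation scores the atom at every shift
    scores = np.correlate(signal, atom, mode="valid")
    pos = int(np.argmax(np.abs(scores)))
    return pos, scores[pos]

# Toy example: plant a scaled, shifted copy of the atom in noise
rng = np.random.default_rng(0)
atom = np.hanning(32) * np.sin(np.linspace(0, 8 * np.pi, 32))
signal = 0.1 * rng.standard_normal(512)
signal[200:232] += 3.0 * atom / np.linalg.norm(atom)

pos, score = best_shift(signal, atom)
print(f"best shift: {pos} (planted at 200), score: {score:.2f}")
```

In a multi-modal setting, the same idea applies per modality (e.g., an audio waveform filter and a video patch filter scored jointly over time), so that the learned pattern captures synchronous audiovisual structure.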


Published in:
Proc. of the International Workshop on Multimedia Content Representation, Classification and Security, Springer-Verlag LNCS vol. 4105, pp. 538-545
Year:
2006