Analysis of Multimodal Signals Using Redundant Representations [Winner of IBM Student Paper Award]

In this work we explore the potential of a framework for representing audio-visual signals using decompositions over overcomplete dictionaries. Redundant decompositions can describe audio-visual sequences concisely while preserving good representation properties, thanks to the use of redundant, well-designed dictionaries. We expect this to help overcome two typical problems of multimodal fusion algorithms. On the one hand, classical representation techniques, such as pixel-based measures (for video) or Fourier-like transforms (for audio), take the physics of the problem into account only marginally. On the other hand, the input signals have high dimensionality. The results we obtain using sparse decompositions of audio-visual signals over redundant codebooks are encouraging and show the potential of the proposed approach to multimodal signal representation.


Published in:
Proc. of IEEE International Conference on Image Processing (ICIP '05), Genova
Year:
2005
Note:
Winner of IBM Student Paper Award
 Record created 2006-06-14, last modified 2018-03-17
