000087376 001__ 87376
000087376 005__ 20190316233753.0
000087376 037__ $$aCONF
000087376 245__ $$aAudiovisual Gestalts
000087376 269__ $$a2006
000087376 260__ $$c2006
000087376 336__ $$aConference Papers
000087376 490__ $$aParallel Computing in Electrical Engineering
000087376 520__ $$aThis paper presents an algorithm to correlate audio and visual data generated by the same physical phenomenon. According to psychophysical experiments, temporal synchrony strongly contributes to the integration of cross-modal information in humans. Thus, we define meaningful audiovisual structures as temporally proximal audio-video events. Audio and video signals are represented as sparse decompositions over redundant dictionaries of functions. In this way, it is possible to define perceptually meaningful audiovisual events. The detection of these cross-modal structures is performed using a simple rule called the Helmholtz principle. Experimental results show that by extracting significant synchronous audiovisual events, we can detect the cross-modal correlation between these signals even in the presence of distracting motion and acoustic noise. These results confirm that temporal proximity between audiovisual events is a key ingredient for the integration of information across modalities and that it can be effectively exploited for the design of multi-modal analysis algorithms.
000087376 6531_ $$aLTS2
000087376 700__ $$0241005$$g150417$$aMonaci, G.
000087376 700__ $$aVandergheynst, P.$$g120906$$0240428
000087376 773__ $$tCVPR Workshop on Perceptual Organization in Computer Vision
000087376 8564_ $$uhttps://infoscience.epfl.ch/record/87376/files/Monaci2006_1479.pdf$$zn/a$$s854077
000087376 909C0 $$xU10380$$0252392$$pLTS2
000087376 909CO $$qGLOBAL_SET$$pconf$$ooai:infoscience.tind.io:87376$$pSTI
000087376 937__ $$aEPFL-CONF-87376
000087376 970__ $$aMonaci2006_1479/LTS
000087376 973__ $$rREVIEWED$$sPUBLISHED$$aEPFL
000087376 980__ $$aCONF