Infoscience

research article

Analysis of Multimodal Sequences Using Geometric Video Representations

Monaci, G. • Divorra Escoda, O. • Vandergheynst, P.
2006
Signal Processing

This paper presents a novel method to correlate audio and visual data generated by the same physical phenomenon, based on a sparse geometric representation of video sequences. The video signal is modeled as a sum of geometric primitives evolving through time that jointly describe the geometric and motion content of the scene. The displacement through time of relevant visual features, such as the mouth of a speaker, can thus be compared with the evolution of an audio feature to assess the correspondence between acoustic and visual signals. Experiments show that the proposed approach can detect and track the speaker's mouth when several persons are present in the scene, in the presence of distracting motion, and without prior face or mouth detection.
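
The comparison described in the abstract, matching the displacement of a visual feature against the evolution of an audio feature, can be illustrated with a minimal sketch. This is not the authors' algorithm: the sparse geometric decomposition is not reproduced, and the sketch assumes that per-frame positions of a few tracked primitives are already available. The function names, the use of Pearson correlation as the similarity measure, and the toy data are all assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have equal length >= 2")
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant signal carries no correlation information
    return cov / math.sqrt(vx * vy)


def displacement(track):
    """Frame-to-frame displacement magnitude of a list of (x, y) positions."""
    return [
        math.hypot(x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(track, track[1:])
    ]


def most_correlated_primitive(tracks, audio_feature):
    """Return (name, score) of the track whose motion best follows the audio.

    tracks: dict mapping a hypothetical primitive name to its (x, y)
        position in each frame.
    audio_feature: one value per frame transition (number of frames - 1),
        for example an audio energy estimate (an assumption, not from
        the paper).
    """
    scores = {
        name: pearson(displacement(track), audio_feature)
        for name, track in tracks.items()
    }
    best_name = max(scores, key=scores.get)
    return best_name, scores[best_name]


# Toy data, invented for illustration only.
audio = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
tracks = {
    # Moves only when the audio feature is high.
    "mouth": [(0, 0), (0, 0), (0, 1), (0, 1), (0, 2), (0, 2), (0, 3)],
    # Distracting motion at constant speed, unrelated to the audio.
    "hand": [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0), (6, 0)],
}

best, score = most_correlated_primitive(tracks, audio)
print(best, score)
```

In this toy setting the primitive whose motion follows the audio feature is selected, while the steadily moving distractor scores zero because its displacement is constant.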

Name: Monaci2005_1303.pdf
Access type: openaccess
Size: 632.25 KB
Format: Adobe PDF
Checksum (MD5): 92586fa299888f816d09d584bb3ca18e
