conference paper

A Mixed-State I-Particle Filter for Multi-Camera Speaker Tracking

Gatica-Perez, Daniel • Lathoud, Guillaume • McCowan, Iain A. • Odobez, Jean-Marc
2003
IEEE International Conference on Computer Vision Workshop on Multimedia Technologies for E-Learning and Collaboration (ICCV-WOMTEC)

Tracking speakers in multi-party conversations represents an important step towards automatic analysis of meetings. In this paper, we present a probabilistic method for audio-visual (AV) speaker tracking in a multi-sensor meeting room. The algorithm fuses information coming from three uncalibrated cameras and a microphone array via a mixed-state importance particle filter, allowing for the integration of AV streams to exploit the complementary features of each modality. Our method relies on several principles. First, a mixed state space formulation is used to define a generative model for camera switching. Second, AV localization information is used to define an importance sampling function, which guides the search process of a particle filter towards regions of the configuration space likely to contain the true configuration (a speaker). Finally, the measurement process integrates shape, color, and audio observations. We show that the principled combination of imperfect modalities results in an algorithm that automatically initializes and tracks speakers engaged in real conversations, reliably switching across cameras and between participants.
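The three principles listed in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the 1-D position, the Gaussian densities, the constants, and every function name below are assumptions chosen for illustration only. Each particle carries a mixed state (a discrete camera index plus a continuous position), part of the particle set is drawn from an importance function centred on an audio-visual location estimate, and those particles are reweighted by the ratio of the dynamics-based prior to the importance density before the observation likelihood is applied.

```python
import math
import random

# All constants are illustrative assumptions, not values from the paper.
NUM_CAMERAS = 3
SWITCH_PROB = 0.1  # assumed chance of a camera switch per time step
DYN_STD = 0.05     # assumed random-walk std of the continuous position
PROP_STD = 0.1     # assumed std of the importance function
OBS_STD = 0.1      # assumed std of the toy observation likelihood


def gauss_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def dynamics_density(new, old):
    """Density p(new | old) of the mixed-state dynamics (camera, position)."""
    if new[0] == old[0]:
        p_cam = 1.0 - SWITCH_PROB
    else:
        p_cam = SWITCH_PROB / (NUM_CAMERAS - 1)
    return p_cam * gauss_pdf(new[1], old[1], DYN_STD)


def sample_dynamics(particle, rng):
    """Draw from the same dynamics: optional camera switch plus random walk."""
    k, x = particle
    if rng.random() < SWITCH_PROB:
        k = rng.choice([c for c in range(NUM_CAMERAS) if c != k])
    return (k, x + rng.gauss(0.0, DYN_STD))


def likelihood(particle, observation):
    """Toy stand-in for the combined shape, color, and audio measurements."""
    cam_term = 0.9 if particle[0] == observation[0] else 0.05
    return cam_term * gauss_pdf(particle[1], observation[1], OBS_STD)


def step(particles, weights, av_estimate, observation, rng, importance_frac=0.5):
    """One filter update mixing importance sampling with plain dynamics."""
    new_particles, new_weights = [], []
    for _ in particles:
        if av_estimate is not None and rng.random() < importance_frac:
            # Importance sampling: propose near the AV location estimate.
            est_k, est_x = av_estimate
            cand = (est_k, rng.gauss(est_x, PROP_STD))
            g = gauss_pdf(cand[1], est_x, PROP_STD)
            # Prior predictive density under the previous weighted particles.
            f = sum(w * dynamics_density(cand, p)
                    for p, w in zip(particles, weights))
            correction = f / g
        else:
            # Standard branch: resample a parent and apply the dynamics.
            parent = rng.choices(particles, weights=weights, k=1)[0]
            cand = sample_dynamics(parent, rng)
            correction = 1.0
        new_particles.append(cand)
        new_weights.append(correction * likelihood(cand, observation))
    total = sum(new_weights)
    if total <= 0.0:
        # Degenerate case: fall back to uniform weights.
        n = len(new_weights)
        return new_particles, [1.0 / n] * n
    return new_particles, [w / total for w in new_weights]


def run_demo(seed=0, steps=10, n=200):
    """Track a hypothetical static speaker seen by camera 2 at position 0.7."""
    rng = random.Random(seed)
    particles = [(rng.randrange(NUM_CAMERAS), rng.random()) for _ in range(n)]
    weights = [1.0 / n] * n
    speaker = (2, 0.7)  # used both as AV estimate and as observation here
    for _ in range(steps):
        particles, weights = step(particles, weights, speaker, speaker, rng)
    return particles, weights
```

Calling `run_demo()` returns the final particles and their normalized weights; summing the weights of particles whose camera index is 2 gives the filter's belief that the speaker is in that view. In a real system the AV estimate and the observations would come from separate, noisy sources rather than from the same fixed tuple used here.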

Author(s)
Gatica-Perez, Daniel • Lathoud, Guillaume • McCowan, Iain A. • Odobez, Jean-Marc
Date Issued

2003

Published in
IEEE International Conference on Computer Vision Workshop on Multimedia Technologies for E-Learning and Collaboration (ICCV-WOMTEC)
Subjects
vision • speech

URL
http://publications.idiap.ch/downloads/reports/2003/rr03-25.pdf

Related documents

http://publications.idiap.ch/index.php/publications/showcite/gatica03c
Written at

EPFL

EPFL units
LIDIAP  
Event name
IEEE International Conference on Computer Vision Workshop on Multimedia Technologies for E-Learning and Collaboration (ICCV-WOMTEC)
Available on Infoscience
March 10, 2006
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/228281