conference paper

AV16.3: an Audio-Visual Corpus for Speaker Localization and Tracking

Lathoud, Guillaume • Odobez, Jean-Marc • Gatica-Perez, Daniel

Editors: Bengio, S. • Bourlard, H.

2005

Machine Learning for Multimodal Interaction: First International Workshop, MLMI 2004

Assessing the quality of a speaker localization or tracking algorithm on a few short examples is difficult, especially when the ground-truth is absent or not well defined. One step towards systematic performance evaluation of such algorithms is to provide time-continuous speaker location annotation over a series of real recordings, covering various test cases. Areas of interest include audio, video and audio-visual speaker localization and tracking. The desired location annotation can be either 2-dimensional (image plane) or 3-dimensional (physical space). This paper motivates and describes a corpus of audio-visual data called "AV16.3", along with a method for 3-D location annotation based on calibrated cameras. "16.3" stands for 16 microphones and 3 cameras, recorded in a fully synchronized manner, in a meeting room. Part of this corpus has already been successfully used to report research results.
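The 3-D annotation relies on calibrated cameras. As a minimal sketch of the underlying geometry (not necessarily the paper's exact annotation procedure), a 3-D point can be triangulated from two calibrated views with a linear (DLT) least-squares solve; the function name and the projection matrices below are illustrative placeholders, with real values coming from the camera calibration.

import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3-D point from two calibrated views.

    P1, P2: 3x4 camera projection matrices (from calibration).
    x1, x2: (u, v) pixel coordinates of the same point in each view.
    """
    # Each pixel observation contributes two linear constraints on X.
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # Homogeneous least-squares solution: right singular vector of A
    # with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

# Toy example: two pinhole cameras 0.2 m apart along the x axis.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-0.2], [0.0], [0.0]])])
X_true = np.array([0.3, 0.1, 2.0, 1.0])
x1 = P1 @ X_true; x1 = x1[:2] / x1[2]
x2 = P2 @ X_true; x2 = x2[:2] / x2[2]
print(triangulate(P1, P2, x1, x2))  # recovers ~[0.3, 0.1, 2.0]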

Files
Name: lathoud04c.pdf
Access type: openaccess
Size: 650.86 KB
Format: Adobe PDF
Checksum (MD5): fcd13034285caba02f9406a704115895
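A downloaded copy can be checked against the MD5 value above; this is a minimal sketch assuming the local filename matches the record:

import hashlib

# Compare the MD5 digest of the downloaded PDF with the checksum listed
# above; the local path "lathoud04c.pdf" is an assumption.
with open("lathoud04c.pdf", "rb") as f:
    digest = hashlib.md5(f.read()).hexdigest()
print(digest == "fcd13034285caba02f9406a704115895")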

Contact: infoscience@epfl.ch

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved.