Repository logo

Infoscience

  • English
  • French
Log In
Logo EPFL, École polytechnique fédérale de Lausanne

Infoscience

  • English
  • French
Log In
  1. Home
  2. Academic and Research Output
  3. Reports, Documentation, and Standards
  4. Clustering And Segmenting Speakers And Their Locations In Meetings
 
report

Clustering And Segmenting Speakers And Their Locations In Meetings

Ajmera, Jitendra
•
Lathoud, Guillaume  
•
McCowan, Iain A.
2003

This paper presents a new approach toward automatic annotation of meetings in terms of speaker identities and their locations. This is achieved by segmenting the audio recordings using two independent sources of information : magnitude spectrum analysis and sound source localization. We combine the two in an appropriate HMM framework. There are three main advantages of this approach. First, it is completely unsupervised, i.e. speaker identities and number of speakers and locations are automatically inferred. Second, it is threshold-free, i.e. the decisions are made without the need of a threshold value which generally requires an additional development dataset. The third advantage is that the joint segmentation improves over the speaker segmentation derived using only acoustic features. Experiments on a series of meetings recorded in the IDIAP Smart Meeting Room demonstrate the effectiveness of this approach.

  • Files
  • Details
  • Metrics
Loading...
Thumbnail Image
Name

idiap-rr-03-55.pdf

Access type

openaccess

Size

230.07 KB

Format

Adobe PDF

Checksum (MD5)

d3d24ecce4e98eb607016bd4e46b7a93

Logo EPFL, École polytechnique fédérale de Lausanne
  • Contact
  • infoscience@epfl.ch

  • Follow us on Facebook
  • Follow us on Instagram
  • Follow us on LinkedIn
  • Follow us on X
  • Follow us on Youtube
AccessibilityLegal noticePrivacy policyCookie settingsEnd User AgreementGet helpFeedback

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, tous droits réservés