Unknown-Multiple Speaker clustering using HMM

Ajmera, Jitendra; Bourlard, Hervé; Lapidot, I.; McCowan, Iain A.

doi:10.21437/ICSLP.2002-195

2002

Abstract

An HMM-based speaker clustering framework is presented in which the number of speakers and the segmentation boundaries are unknown a priori. Ideally, the system creates one pure cluster for each speaker. The HMM is ergodic, with a minimum-duration topology. The final number of clusters is determined automatically by merging the closest clusters and retraining the merged cluster until a decrease in likelihood is observed. In the same framework, we also examine the effect of using only the features from highly voiced frames as a means of improving the robustness and reducing the computational complexity of the algorithm. The proposed system is assessed on the 1996 HUB-4 evaluation test set in terms of both cluster purity and speaker purity. It is shown that the number of clusters found often corresponds to the actual number of speakers.
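
The merge-and-stop loop described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. The function names `merge_until_likelihood_drops` and `loo_log_likelihood` are hypothetical, "closest" is assumed to mean the pair whose merge yields the highest total likelihood, and a leave-one-out unit-variance Gaussian score on 1-D values stands in for the HMM retraining and likelihood computation, only so that the example is self-contained.

```python
import math
from itertools import combinations
from typing import Callable, List, Sequence, Tuple

Cluster = Tuple[float, ...]


def loo_log_likelihood(clusters: Sequence[Cluster]) -> float:
    """Toy stand-in score (an assumption, not the paper's HMM likelihood).

    Each point is scored under a unit-variance Gaussian whose mean is
    estimated from the other points of the same cluster.
    """
    total = 0.0
    for cluster in clusters:
        n = len(cluster)
        if n < 2:
            raise ValueError("each cluster needs at least two points")
        s = sum(cluster)
        for x in cluster:
            mean_of_others = (s - x) / (n - 1)
            total += -0.5 * (x - mean_of_others) ** 2 - 0.5 * math.log(2 * math.pi)
    return total


def merge_until_likelihood_drops(
    clusters: Sequence[Sequence[float]],
    total_log_likelihood: Callable[[Sequence[Cluster]], float],
) -> List[Cluster]:
    """Greedily merge cluster pairs until the best merge lowers the score."""
    current: List[Cluster] = [tuple(c) for c in clusters]
    best = total_log_likelihood(current)
    while len(current) > 1:
        candidates = []
        for i, j in combinations(range(len(current)), 2):
            # Re-score the whole clustering with clusters i and j pooled;
            # this plays the role of retraining the merged cluster.
            merged = [c for k, c in enumerate(current) if k not in (i, j)]
            merged.append(current[i] + current[j])
            candidates.append((total_log_likelihood(merged), merged))
        score, merged = max(candidates, key=lambda item: item[0])
        if score < best:
            break  # a decrease in likelihood: keep the current clustering
        current, best = merged, score
    return current


if __name__ == "__main__":
    initial = [(0.0, 1.0), (0.0, 1.0), (10.0, 11.0), (10.0, 11.0)]
    final = merge_until_likelihood_drops(initial, loo_log_likelihood)
    print(len(final), "clusters:", final)
```

On this toy input, four initial clusters drawn from two well-separated groups are merged down to two, and the loop stops because pooling the two remaining clusters would lower the score. A real system would replace the scoring function with likelihoods from retrained models.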

Details

Title Unknown-Multiple Speaker clustering using HMM

Author(s) Ajmera, Jitendra ; Bourlard, Hervé ; Lapidot, I. ; McCowan, Iain A.

Published in 7th International Conference on Spoken Language Processing (ICSLP 2002)

Pages 573-576

Conference ICSLP, Denver, Colorado

Date 2002

Keywords

speech; ajmera; bourlard; lapidot; mccowan

Note IDIAP-RR 02-07

DOI https://doi.org/10.21437/ICSLP.2002-195

Laboratories LIDIAP

Record Appears in Scientific production and competences > STI - School of Engineering > IEM - Institut d'Electricité et de Microtechnique > LIDIAP - L'IDIAP Laboratory
Scientific production and competences > Euler Center for Signal Processing
Conference Papers
Work produced at EPFL
Published

Record creation date 2006-03-10
