Unknown-Multiple Speaker clustering using HMM

An HMM-based speaker clustering framework is presented, where the number of speakers and segmentation boundaries are unknown \emph{a priori}. Ideally, the system aims to create one pure cluster for each speaker. The HMM is ergodic in nature with a minimum duration topology. The final number of clusters is determined automatically by merging closest clusters and retraining this new cluster, until a decrease in likelihood is observed. In the same framework, we also examine the effect of using only the features from highly voiced frames as a means of improving the robustness and computational complexity of the algorithm. The proposed system is assessed on the 1996 HUB-4 evaluation test set in terms of both cluster and speaker purity. It is shown that the number of clusters found often correspond to the actual number of speakers.


Published in:
ICSLP, 573-576
Presented at:
ICSLP
Year:
2002
Publisher:
Denver, Colorado
Keywords:
Note:
IDIAP-RR 02-07
Laboratories:




 Record created 2006-03-10, last modified 2018-03-17

n/a:
Download fulltextPDF
External links:
Download fulltextURL
Download fulltextRelated documents
Rate this document:

Rate this document:
1
2
3
 
(Not yet reviewed)