Information Bottleneck Features for HMM/GMM Speaker Diarization of Meeting Recordings

Diarization results can be improved by combining multiple systems. Several combination techniques have been proposed, based on output voting, initialization, and integrated approaches. This paper proposes and investigates a novel approach that combines diarization systems through the use of features. A first diarization system, based on the Information Bottleneck (IB), generates a set of features that contain information relevant to the clustering. These features are then used in conjunction with conventional MFCC features in a second diarization system. The method is inspired by the TANDEM framework in ASR. Although fully integrated, the approach requires no modification to either of the two systems in order to integrate the information. Experiments on 24 recordings from the NIST RT06/RT07/RT09 evaluations, collected in five meeting rooms, reveal that when the IB features are used together with MFCC, the total speaker error is reduced from 12% to 9.7%, i.e., by approximately 19% relative.
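
The relative figure quoted above follows directly from the two speaker error rates. The short Python sketch below reproduces that arithmetic. The function name `relative_reduction` is hypothetical and used only for illustration, and treating the 12% figure as the MFCC-only baseline is an assumption based on the wording of the abstract.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Return the relative reduction, in percent, from baseline to improved."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - improved) / baseline


# Speaker error figures quoted in the abstract (percent).
baseline_error = 12.0  # assumed to be the system without IB features
combined_error = 9.7   # MFCC used together with IB features

absolute_gain = baseline_error - combined_error
relative_gain = relative_reduction(baseline_error, combined_error)

# Prints: absolute: 2.3 points, relative: 19.2%
print(f"absolute: {absolute_gain:.1f} points, relative: {relative_gain:.1f}%")
```

The absolute gain is 2.3 percentage points; dividing by the 12% baseline gives roughly 19.2%, consistent with the "approximately 19% relative" stated in the abstract.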

Presented at:
Interspeech, Florence, Italy

 Record created 2013-12-19, last modified 2018-03-17
