Speaker verification in score-ageing-quality classification space

Kelly, Finnian; Drygajlo, Andrzej; Harte, Naomi

doi:10.1016/j.csl.2012.12.005

research article

2013

Computer Speech And Language

A challenge in automatic speaker verification is to create a system that is robust to the effects of vocal ageing. To observe the ageing effect, a speaker's voice must be analysed over a period of time, during which variation in the quality of the voice samples is likely to be encountered. Thus, in dealing with the ageing problem, the related issue of quality must also be addressed. We present a solution to speaker verification across ageing by using a stacked classifier framework to combine ageing and quality information with the scores of a baseline classifier. In tandem, the Trinity College Dublin Speaker Ageing database of 18 speakers, each covering a 30-60 year time range, is presented. An evaluation of a baseline Gaussian Mixture Model Universal Background Model (GMM-UBM) system using this database demonstrates a progressive degradation in genuine speaker verification scores as ageing progresses. Consequently, applying a conventional threshold, determined using scores at the time of enrolment, results in poor long-term performance. The influence of quality on verification scores is investigated via a number of quality measures. Alongside established signal-based measures, a new model-based measure, Wnorm, is proposed, and its utility is demonstrated on the CSLU database. Combining ageing information with quality measures and the scores from the GMM-UBM system, a verification decision boundary is created in score-ageing-quality space. The best performance is achieved by using scores and ageing in conjunction with the new Wnorm quality measure, reducing verification error by 45% relative to the baseline. This work represents the first comprehensive analysis of speaker verification on a longitudinal speaker database and successfully addresses the associated variability from ageing and quality artefacts. © 2013 Elsevier Ltd. All rights reserved.
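The contrast the abstract draws between a fixed enrolment-time threshold and a decision boundary in score-ageing-quality space can be illustrated with a small sketch. Everything here is an assumption made for illustration: the trials are synthetic, the quality value is a generic stand-in in [0, 1] rather than Wnorm, and the second-stage classifier is a plain logistic regression, since the abstract does not say which classifier the stacked framework uses. All function and variable names are hypothetical.

```python
import math
import random


def make_trials(n, rng):
    """Synthetic (score, ageing_years, quality, label) trials. Illustrative only."""
    trials = []
    for _ in range(n):
        age = rng.uniform(0.0, 30.0)     # years since enrolment (assumed feature)
        quality = rng.uniform(0.0, 1.0)  # stand-in quality measure, not Wnorm
        genuine = rng.random() < 0.5
        if genuine:
            # Assumed behaviour: genuine scores drift down with ageing and poor quality.
            score = 3.0 - 0.08 * age - (1.0 - quality) + rng.gauss(0.0, 0.4)
        else:
            score = -1.0 + rng.gauss(0.0, 0.4)
        trials.append((score, age, quality, 1 if genuine else 0))
    return trials


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def standardise(rows, stats=None):
    """Scale each column to zero mean and unit variance, reusing stats if given."""
    if stats is None:
        stats = []
        for col in zip(*rows):
            m = sum(col) / len(col)
            s = math.sqrt(sum((v - m) ** 2 for v in col) / len(col)) or 1.0
            stats.append((m, s))
    return [[(v - m) / s for v, (m, s) in zip(r, stats)] for r in rows], stats


def train_logistic(X, y, epochs=300, lr=0.5):
    """Full-batch gradient descent for logistic regression; w[0] is the bias."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def error_rate(decisions, labels):
    return sum(d != l for d, l in zip(decisions, labels)) / len(labels)


rng = random.Random(0)
train, test = make_trials(600, rng), make_trials(600, rng)
labels = [l for _, _, _, l in test]

# Baseline: one score threshold set from trials close to enrolment (under 2 years).
early_genuine = [s for s, a, _, l in train if l == 1 and a < 2.0]
impostor = [s for s, _, _, l in train if l == 0]
threshold = 0.5 * (sum(early_genuine) / len(early_genuine) + sum(impostor) / len(impostor))
fixed_err = error_rate([1 if s > threshold else 0 for s, _, _, _ in test], labels)

# Stacked stage: a linear boundary over (score, ageing, quality).
X_train, stats = standardise([[s, a, q] for s, a, q, _ in train])
w = train_logistic(X_train, [l for _, _, _, l in train])
X_test, _ = standardise([[s, a, q] for s, a, q, _ in test], stats)
stacked = [1 if w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)) > 0 else 0 for x in X_test]
stacked_err = error_rate(stacked, labels)

print(f"fixed threshold error: {fixed_err:.3f}")
print(f"score-ageing-quality boundary error: {stacked_err:.3f}")
```

On data generated this way, the fixed threshold tends to reject genuine trials recorded long after enrolment, while the boundary over all three inputs can shift with ageing and quality. The figures it prints describe only this toy data and say nothing about the results reported in the article.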

Type

research article

DOI

10.1016/j.csl.2012.12.005

Web of Science ID

WOS:000318139300002

Authors

Kelly, Finnian; Drygajlo, Andrzej; Harte, Naomi

Publication date

2013

Publisher

Academic Press Ltd - Elsevier Science Ltd

Published in

Computer Speech And Language

Volume

27

Issue

5

Start page

1068

End page

1084

Subjects

Speaker verification

Ageing

Quality measures

Peer reviewed

REVIEWED

EPFL units

LIDIAP

Available on Infoscience

October 1, 2013

Use this identifier to reference this record

https://infoscience.epfl.ch/handle/20.500.14299/95713