Cross-database evaluation of audio-based spoofing detection systems

Korshunov, Pavel; Marcel, Sébastien

doi:10.21437/Interspeech.2016-1326

conference paper

2016

Interspeech 2016

Since automatic speaker verification (ASV) systems are highly vulnerable to spoofing attacks, it is important to develop mechanisms that can detect such attacks. To be practical, however, a spoofing attack detection approach should (i) have high accuracy, (ii) generalize well to practical attacks, and (iii) be simple and efficient. Several audio-based spoofing detection methods have been proposed recently, but their evaluation is limited to less realistic databases containing homogeneous data. In this paper, we consider eight existing presentation attack detection (PAD) methods and evaluate their performance using two major publicly available speaker databases with spoofing attacks: AVspoof and ASVspoof. We first show that realistic presentation attacks (speech is replayed to the PAD system) are significantly more challenging for the considered PAD methods than the so-called 'logical access' attacks (speech is presented to the PAD system directly). Then, via a cross-database evaluation, we demonstrate that the existing methods generalize poorly when different databases or different types of attacks are used for training and testing. The results question the efficiency and practicality of the existing PAD systems and call for the creation of databases with a larger variety of realistic speech presentation attacks.
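The cross-database idea in the abstract can be illustrated with a minimal sketch: a decision threshold is tuned on one database and then applied, unchanged, to another. Everything below is an assumption made for illustration only. The scores are synthetic Gaussian draws, "database A" and "database B" are hypothetical stand-ins rather than AVspoof or ASVspoof, and the helper names (`error_rates`, `eer_threshold`, `synthetic_scores`) are not taken from the paper. The numbers it produces say nothing about the paper's results.

```python
import random


def error_rates(genuine, attacks, threshold):
    """Return (far, frr) for a detector that accepts scores >= threshold.

    far: fraction of attack samples wrongly accepted as genuine.
    frr: fraction of genuine samples wrongly rejected.
    """
    far = sum(s >= threshold for s in attacks) / len(attacks)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def eer_threshold(genuine, attacks):
    """Pick the observed score at which FAR and FRR are closest."""
    def gap(t):
        far, frr = error_rates(genuine, attacks, t)
        return abs(far - frr)
    return min(sorted(set(genuine) | set(attacks)), key=gap)


def synthetic_scores(rng, genuine_mean, attack_mean, n=500):
    """Draw made-up detector scores (assumption: unit-variance Gaussians)."""
    genuine = [rng.gauss(genuine_mean, 1.0) for _ in range(n)]
    attacks = [rng.gauss(attack_mean, 1.0) for _ in range(n)]
    return genuine, attacks


rng = random.Random(0)

# Hypothetical database A: attacks are well separated from genuine speech.
dev_a = synthetic_scores(rng, genuine_mean=2.0, attack_mean=-2.0)
test_a = synthetic_scores(rng, genuine_mean=2.0, attack_mean=-2.0)

# Hypothetical database B: unseen attacks that score close to genuine speech.
test_b = synthetic_scores(rng, genuine_mean=2.0, attack_mean=1.0)

# Tune the threshold on A's development set, then reuse it unchanged.
threshold = eer_threshold(*dev_a)
far_a, frr_a = error_rates(*test_a, threshold)
far_b, frr_b = error_rates(*test_b, threshold)

print(f"same-database  FAR={far_a:.3f} FRR={frr_a:.3f}")
print(f"cross-database FAR={far_b:.3f} FRR={frr_b:.3f}")
```

In this toy setup the threshold that works on database A lets most of database B's attacks through, which is the kind of generalization gap a cross-database protocol is designed to expose.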

Name

Korshunov_INTERSPEECH_2016.pdf

Access type

openaccess

Size

481.17 KB

Format

Adobe PDF

Checksum (MD5)

924ef6be863a0c7c0f87ae7607666d3a