Infoscience

research article

Adjustable deterministic pseudonymization of speech

Dubagunta, S. Pavankumar • van Son, Rob J. J. H. • Magimai-Doss, Mathew
March 1, 2022
Computer Speech And Language

As public speech resources become increasingly available, there is growing interest in preserving the privacy of speakers through methods that remove speaker information from speech while preserving the spoken linguistic content. This paper presents a method for pseudonymization (reversible anonymization) of speech that obfuscates the speaker identity in untranscribed running speech. The approach manipulates the spectrotemporal structure of the speech to simulate a vocal tract of different length and structure by modifying the formant locations, and it also alters the pitch and speaking rate. The method is deterministic and partially reversible, and the changes are adjustable on a continuous scale. The method was evaluated through (i) ABX listening experiments and (ii) automatic speaker verification and speech recognition. The ABX results indicate that speaker identifiability among forced-choice pairs dropped from over 90% to less than 70% after pseudonymization, and that de-pseudonymization was partially effective. An evaluation on the VoicePrivacy 2020 challenge data showed that the proposed approach performs better than the signal-processing-based baseline method that uses the McAdams coefficient and slightly worse than the neural-source-filtering-based baseline method. Further analysis showed that the proposed approach (i) is comparable to the neural-source-filtering-based baseline in terms of a phone-posterior-feature-based objective intelligibility measure, (ii) preserves formant tracks better than the McAdams-based method, and (iii) preserves paralinguistic aspects such as dysarthria in several speakers.
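
To make the idea of a deterministic, continuously adjustable, and partially reversible transformation concrete, the following minimal Python sketch scales the time axis of a signal by a factor derived from a single strength setting. It is only an illustration and not the authors' method: the abstract describes modifying formant locations, pitch, and speaking rate, whereas this toy applies one uniform resampling that shifts all frequencies and the duration together. The function names `warp_factor`, `scale_frequencies`, `pseudonymize`, and `depseudonymize`, and the parameters `strength` and `max_shift`, are assumptions introduced here.

```python
from typing import List, Sequence


def warp_factor(strength: float, max_shift: float = 0.2) -> float:
    """Map a strength in [-1, 1] to a frequency scaling factor.

    Hypothetical mapping: strength 0 gives factor 1 (no change),
    strength 1 gives 1 + max_shift, strength -1 gives its reciprocal.
    """
    if not -1.0 <= strength <= 1.0:
        raise ValueError("strength must lie in [-1, 1]")
    return (1.0 + max_shift) ** strength


def scale_frequencies(signal: Sequence[float], factor: float) -> List[float]:
    """Resample by linear interpolation.

    Played back at the original sample rate, every frequency in the
    output is multiplied by `factor` and the duration is divided by it.
    """
    n_in = len(signal)
    if n_in == 0:
        return []
    n_out = max(1, round(n_in / factor))
    out = []
    for k in range(n_out):
        # Position in the input signal, clamped to the last sample.
        pos = min(k * factor, n_in - 1)
        i = int(pos)
        frac = pos - i
        nxt = signal[min(i + 1, n_in - 1)]
        out.append((1.0 - frac) * signal[i] + frac * nxt)
    return out


def pseudonymize(signal: Sequence[float], strength: float) -> List[float]:
    """Deterministic: the same signal and strength always give the same output."""
    return scale_frequencies(signal, warp_factor(strength))


def depseudonymize(signal: Sequence[float], strength: float) -> List[float]:
    """Apply the reciprocal factor; recovery is approximate, not exact."""
    return scale_frequencies(signal, 1.0 / warp_factor(strength))
```

With `strength = 0` the signal is returned unchanged. Calling `depseudonymize` with the same `strength` restores the original only approximately, because linear interpolation discards some detail; this loosely mirrors the "partially reversible" property stated in the abstract.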

Type
research article
DOI
10.1016/j.csl.2021.101284
Web of Science ID

WOS:000728821200003

Author(s)
Dubagunta, S. Pavankumar
van Son, Rob J. J. H.
Magimai-Doss, Mathew  
Date Issued

2022-03-01

Publisher

ACADEMIC PRESS LTD- ELSEVIER SCIENCE LTD

Published in
Computer Speech And Language
Volume

72

Article Number

101284

Subjects

Computer Science, Artificial Intelligence • Computer Science • speech privacy • speech pseudonymization • speech signal processing • speech features • identification • articulation

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
LIDIAP  
Available on Infoscience
January 31, 2022
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/184961