conference paper

Unsupervised Rhythm and Voice Conversion to Improve ASR on Dysarthric Speech

El Hajal, Karl • Hermann, Enno • Hovsepyan, Sevada • Magimai-Doss, Mathew
2025
Interspeech 2025
26 Interspeech Conference

Automatic speech recognition (ASR) systems struggle with dysarthric speech due to high inter-speaker variability and slow speaking rates. To address this, we explore dysarthric-to-healthy speech conversion for improved ASR performance. Our approach extends the Rhythm and Voice (RnV) conversion framework by introducing a syllable-based rhythm modeling method suited for dysarthric speech. We assess its impact on ASR by training LF-MMI models and fine-tuning Whisper on converted speech. Experiments on the Torgo corpus reveal that LF-MMI achieves significant word error rate reductions, especially for more severe cases of dysarthria, while fine-tuning Whisper on converted data has minimal effect on its performance. These results highlight the potential of unsupervised rhythm and voice conversion for dysarthric ASR. Code available at: https://github.com/idiap/RnV.
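The abstract reports results as word error rate (WER) reductions. As a minimal sketch of how that metric is conventionally computed (word-level edit distance divided by the number of reference words), and not the evaluation code used in the paper, the following may help. The function name `word_error_rate` and the example sentences are hypothetical and are not taken from the Torgo corpus or the RnV repository.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Illustrative only: whitespace tokenization, no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substitution ("sat" -> "sit") and one deletion ("the").
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

A lower value means fewer recognition errors relative to the reference transcript. Published evaluations typically also normalize case and punctuation before scoring, which this sketch omits.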

Type
conference paper
DOI
10.21437/Interspeech.2025-2069
Scopus ID

2-s2.0-105020056989

Author(s)
El Hajal, Karl  

EPFL

Hermann, Enno

Institut Dalle Molle D'intelligence Artificielle Perceptive

Hovsepyan, Sevada

Institut Dalle Molle D'intelligence Artificielle Perceptive

Magimai-Doss, Mathew

Institut Dalle Molle D'intelligence Artificielle Perceptive

Date Issued

2025

Publisher

International Speech Communication Association

Published in
Interspeech 2025
Start page

2760

End page

2764

Subjects

Dysarthric Speech Recognition • Rhythm Modeling • Unsupervised • Voice Conversion

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
LIDIAP  
Event name

26 Interspeech Conference

Event place

Rotterdam, Netherlands

Event date

2025-08-17 - 2025-08-21

Available on Infoscience
November 7, 2025
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/255673