conference paper

XLSR-Transducer: Streaming ASR for Self-Supervised Pretrained Models

Kumar, Shashi • Madikeri, Srikanth • Zuluaga Gomez, Juan Pablo • …
Rao, Bhaskar D • Trancoso, Isabel • …
2025
Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Self-supervised pretrained models exhibit competitive performance in automatic speech recognition (ASR) when fine-tuned, even with limited in-domain supervised data. However, popular pretrained models are not suitable for streaming ASR because they are trained with full attention context. In this paper, we introduce XLSR-Transducer, in which the XLSR-53 model is used as the encoder in a transducer setup. Our experiments on the AMI dataset reveal that the XLSR-Transducer achieves a 4% absolute WER improvement over Whisper large-v2 and an 8% absolute improvement over a Zipformer transducer model trained from scratch. To enable streaming capabilities, we investigate different attention masking patterns in the self-attention computation of the transformer layers within the XLSR-53 model. We validate XLSR-Transducer on AMI and on 5 languages from CommonVoice under low-resource scenarios. Finally, with the introduction of attention sinks, we reduce the left context by half while achieving a relative 12% improvement in WER.
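The attention masking idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `streaming_attention_mask`, its parameters, and the exact pattern (chunk-wise attention with a bounded number of left chunks, plus the first few frames kept visible as attention sinks) are assumptions chosen for illustration only.

```python
def streaming_attention_mask(num_frames, chunk_size, left_chunks, num_sinks=0):
    """Build a boolean mask where mask[q][k] is True if query frame q may
    attend to key frame k.

    Assumed pattern (hypothetical, for illustration):
      * frames are grouped into chunks of `chunk_size`;
      * a frame sees its own chunk and up to `left_chunks` previous chunks;
      * the first `num_sinks` frames stay visible to every later frame
        (attention sinks), even when they fall outside the left context.
    """
    mask = []
    for q in range(num_frames):
        q_chunk = q // chunk_size
        row = []
        for k in range(num_frames):
            k_chunk = k // chunk_size
            # Never look at chunks that lie in the future.
            in_past_or_current = k_chunk <= q_chunk
            # Bounded left context, measured in whole chunks.
            in_window = in_past_or_current and k_chunk >= q_chunk - left_chunks
            # Sink frames remain attendable regardless of the window.
            is_sink = in_past_or_current and k < num_sinks
            row.append(in_window or is_sink)
        mask.append(row)
    return mask


if __name__ == "__main__":
    # 8 frames, chunks of 2 frames, 1 left chunk, 1 sink frame.
    for row in streaming_attention_mask(8, 2, 1, num_sinks=1):
        print("".join("x" if allowed else "." for allowed in row))
```

In this sketch, reducing `left_chunks` shrinks the history each frame attends to, while `num_sinks` keeps a few initial frames visible. That is one way to picture how the left context can be shortened without dropping the earliest frames entirely.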

Name: 2407.04439
Type: Main Document
Version: http://purl.org/coar/version/c_ab4af688f83e57aa
Access type: openaccess
License Condition: CC BY
Size: 515.76 KB
Format: Unknown
Checksum (MD5): 8ae382c511329d8e75758a66aab9eb17
