Infoscience

conference paper

Speaker Embeddings as Individuality Proxy for Voice Stress Detection

Wu, Zihan • Scheidwasser-Clow, Neil • El Hajal, Karl • Cernak, Milos
January 1, 2023
Interspeech 2023
Interspeech Conference

Because a speaker's mental state modulates speech, stress induced by cognitive or physical load could be detected in the voice. The existing voice stress detection benchmark has shown that audio embeddings extracted from the Hybrid BYOL-S self-supervised model perform well. However, that benchmark evaluates performance on each dataset separately and does not evaluate performance across different types of stress or different languages. Moreover, previous studies have found strong individual differences in stress susceptibility. This paper presents the design and development of a voice stress detector trained on more than 100 speakers from nine language groups and five types of stress. We address individual variability in voice stress analysis by adding speaker embeddings to the Hybrid BYOL-S features. The proposed method significantly improves voice stress detection performance with input audio of only 3-5 seconds.
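
The feature combination described in the abstract can be illustrated with a minimal sketch. It assumes that "adding" speaker embeddings means concatenating them with the per-utterance audio features; the record does not say how the two are combined, so this is an illustration rather than the authors' method. The function name `fuse_embeddings` and the tiny vectors are hypothetical stand-ins for real Hybrid BYOL-S and speaker-embedding outputs.

```python
from typing import List, Sequence


def fuse_embeddings(
    audio_embs: Sequence[Sequence[float]],
    speaker_embs: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Concatenate each utterance's audio embedding with its speaker embedding.

    Assumption: fusion is plain concatenation along the feature axis.
    """
    if len(audio_embs) != len(speaker_embs):
        raise ValueError("need exactly one speaker embedding per utterance")
    return [list(a) + list(s) for a, s in zip(audio_embs, speaker_embs)]


# Hypothetical toy inputs: two utterances, 3-dim audio and 2-dim speaker vectors.
audio_embs = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
speaker_embs = [[1.0, 2.0], [3.0, 4.0]]

fused = fuse_embeddings(audio_embs, speaker_embs)
print(len(fused), len(fused[0]))
```

The fused vectors would then be passed to a stress classifier, so that the classifier sees speaker identity information alongside the utterance-level features.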

Type
conference paper
DOI
10.21437/Interspeech.2023-2070
Web of Science ID

WOS:001186650301198

Author(s)
Wu, Zihan  

École Polytechnique Fédérale de Lausanne

Scheidwasser-Clow, Neil

University of Copenhagen

El Hajal, Karl  

École Polytechnique Fédérale de Lausanne

Cernak, Milos

Logitech International S.A.

Date Issued

2023-01-01

Publisher

ISCA (International Speech Communication Association)

Publisher place

Baixas

Published in
Interspeech 2023
ISBN of the book

Series title/Series vol.

Interspeech

ISSN (of the series)

2308-457X

Start page

1838

End page

1842

Subjects

  • speech recognition
  • human-computer interaction
  • computational paralinguistics

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
LCN1  
LIDIAP  
Event name

Interspeech Conference

Event place

Dublin, Ireland

Event date

2023-08-20 - 2023-08-24

Available on Infoscience
January 31, 2025
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/246168