Speaker Embeddings as Individuality Proxy for Voice Stress Detection

Wu, Zihan; Scheidwasser-Clow, Neil; El Hajal, Karl; Cernak, Milos

doi:10.21437/Interspeech.2023-2070

conference paper

January 1, 2023

Interspeech 2023

Interspeech Conference

Because a speaker's mental state modulates speech, stress induced by cognitive or physical load can be detected in the voice. The existing voice stress detection benchmark has shown that audio embeddings extracted from the Hybrid BYOL-S self-supervised model perform well. However, the benchmark evaluates performance on each dataset separately and does not evaluate performance across different types of stress or different languages. Moreover, previous studies have found strong individual differences in stress susceptibility. This paper presents the design and development of a voice stress detection system trained on more than 100 speakers from nine language groups and five types of stress. We address individual variability in voice stress analysis by adding speaker embeddings to the Hybrid BYOL-S features. The proposed method significantly improves voice stress detection performance with an input audio length of only 3-5 seconds.
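The abstract does not say how the speaker embeddings are combined with the Hybrid BYOL-S features. As a minimal sketch, assuming the two are simply concatenated per utterance, the fusion step might look like the following. The function names, the L2 normalization, and the embedding dimensions (2048 and 192) are illustrative assumptions, not details taken from the paper, and the random arrays only stand in for real model outputs.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit L2 norm so neither embedding dominates by magnitude."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def fuse_embeddings(audio_emb, speaker_emb):
    """Concatenate per-utterance audio and speaker embeddings along the feature axis.

    Assumption: plain concatenation; the paper may combine them differently.
    """
    if audio_emb.shape[0] != speaker_emb.shape[0]:
        raise ValueError("both inputs need one row per utterance")
    return np.concatenate(
        [l2_normalize(audio_emb), l2_normalize(speaker_emb)], axis=1
    )


# Random placeholders standing in for real embeddings (dimensions are arbitrary).
rng = np.random.default_rng(0)
audio_emb = rng.normal(size=(4, 2048))    # hypothetical audio features
speaker_emb = rng.normal(size=(4, 192))   # hypothetical speaker features

fused = fuse_embeddings(audio_emb, speaker_emb)
print(fused.shape)  # (4, 2240)
```

The fused vectors would then be passed to whatever downstream stress classifier is in use. Normalizing each part before concatenation is one common way to keep the larger embedding from dominating, but it is a design choice made here for illustration.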

Type

conference paper

DOI

10.21437/Interspeech.2023-2070

Web of Science ID

WOS:001186650301198

Author(s)

Wu, Zihan

École Polytechnique Fédérale de Lausanne

Scheidwasser-Clow, Neil

University of Copenhagen

El Hajal, Karl

École Polytechnique Fédérale de Lausanne

Cernak, Milos

Logitech International S.A.

Date Issued

2023-01-01

Publisher

ISCA - International Speech Communication Association

Publisher place

Baixas

Published in

Interspeech 2023

Series title/Series vol.

Interspeech

ISSN (of the series)

2308-457X

Start page

1838

End page

1842

Subjects

speech recognition

human-computer interaction

computational paralinguistics

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units

LCN1

LIDIAP

Event name

Interspeech Conference

Event place

Dublin, Ireland

Event date

2023-08-20 - 2023-08-24

Available on Infoscience

January 31, 2025

Use this identifier to reference this record

https://infoscience.epfl.ch/handle/20.500.14299/246168