conference paper

Serab: A Multi-Lingual Benchmark For Speech Emotion Recognition

Scheidwasser-Clow, Neil
•
Kegler, Mikolaj
•
Beckmann, Pierre
•
Cernak, Milos
January 1, 2022
2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
47th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Recent developments in speech emotion recognition (SER) often leverage deep neural networks (DNNs). Comparing and benchmarking different DNN models can often be tedious due to the use of different datasets and evaluation protocols. To facilitate the process, here, we present the Speech Emotion Recognition Adaptation Benchmark (SERAB), a framework for evaluating the performance and generalization capacity of different approaches for utterance-level SER. The benchmark is composed of nine datasets for SER in six languages. Since the datasets have different sizes and numbers of emotional classes, the proposed setup is particularly suitable for estimating the generalization capacity of pre-trained DNN-based feature extractors. We used the proposed framework to evaluate a selection of standard hand-crafted feature sets and state-of-the-art DNN representations. The results highlight that using only a subset of the data included in SERAB can result in biased evaluation, while compliance with the proposed protocol can circumvent this issue.
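The abstract's point about biased evaluation on a subset of datasets can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the corpus and extractor names are placeholders, the accuracy values are invented, and the unweighted mean across datasets is one plausible aggregation, not necessarily the scoring rule that SERAB prescribes.

```python
from statistics import mean


def accuracy(y_true, y_pred):
    """Fraction of utterances whose predicted label matches the reference."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    if not y_true:
        raise ValueError("cannot score an empty dataset")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def benchmark_score(per_dataset_accuracy, datasets=None):
    """Unweighted mean accuracy over the chosen datasets (all by default).

    Assumption: each dataset counts equally, regardless of its size or
    number of emotion classes.
    """
    names = list(per_dataset_accuracy) if datasets is None else list(datasets)
    return mean(per_dataset_accuracy[name] for name in names)


def rank(results, datasets=None):
    """Order extractors from best to worst benchmark score."""
    return sorted(
        results,
        key=lambda model: benchmark_score(results[model], datasets),
        reverse=True,
    )


# Hypothetical per-dataset accuracies; invented numbers, not results
# reported in the paper.
results = {
    "extractor_a": {"corpus_1": 0.80, "corpus_2": 0.50, "corpus_3": 0.40},
    "extractor_b": {"corpus_1": 0.70, "corpus_2": 0.60, "corpus_3": 0.60},
}

subset_ranking = rank(results, datasets=["corpus_1"])  # one corpus only
full_ranking = rank(results)                           # every corpus
```

In this toy setting, `extractor_a` ranks first when only `corpus_1` is scored, while `extractor_b` ranks first once all three corpora are averaged, which is the kind of reversal a fixed multi-dataset protocol is meant to expose.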

Type
conference paper
DOI
10.1109/ICASSP43922.2022.9747348
Web of Science ID

WOS:000864187908001

Author(s)
Scheidwasser-Clow, Neil
•
Kegler, Mikolaj
•
Beckmann, Pierre  
•
Cernak, Milos
Date Issued

2022-01-01

Publisher

IEEE

Publisher place

New York

Published in
2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
ISBN of the book

978-1-6654-0540-9

Series title/Series vol.

International Conference on Acoustics Speech and Signal Processing ICASSP

Start page

7697

End page

7701

Subjects

Acoustics
•
Computer Science, Artificial Intelligence
•
Engineering, Electrical & Electronic
•
Computer Science
•
Engineering
•
emotion recognition
•
computational paralinguistics
•
deep neural networks
•
speech processing
•
transfer learning

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
LSIR  
Event name

47th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Event place

Singapore, Singapore

Event date

May 22-27, 2022

Available on Infoscience
January 16, 2023
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/193824