conference paper

BERTraffic: BERT-Based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications

Zuluaga-Gomez, Juan
•
Sarfjoo, Seyyed Saeed
•
Prasad, Amrutha
January 1, 2022
2022 IEEE Spoken Language Technology Workshop, SLT
IEEE Spoken Language Technology Workshop (SLT)

Automatic speech recognition (ASR) allows transcribing the communications between air traffic controllers (ATCOs) and aircraft pilots. The transcriptions are used later to extract ATC named entities, e.g., aircraft callsigns. One common challenge is speech activity detection (SAD) and speaker diarization (SD). In the failure condition, two or more segments remain in the same recording, jeopardizing the overall performance. We propose a system that combines SAD and a BERT model to perform speaker change detection and speaker role detection (SRD) by chunking ASR transcripts, i.e., SD with a defined number of speakers together with SRD. The proposed model is evaluated on real-life public ATC databases. Our BERT SD model baseline reaches up to 10% and 20% token-based Jaccard error rate (JER) in public and private ATC databases. We also achieved relative improvements of 32% and 7.7% in JERs and SD error rate (DER), respectively, compared to VBx, a well-known SD system.
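The idea of treating speaker change and speaker role detection as tagging over ASR tokens, scored with a token-based Jaccard error rate, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names `split_by_role` and `token_jaccard_error`, the role labels `ATCO` and `PILOT`, the example utterance, and the exact error formula (for each reference role, one minus the intersection over union of token positions, averaged over roles). The per-token role tags are supplied by hand here; in the described system they would come from a BERT model.

```python
from itertools import groupby


def split_by_role(tokens, roles):
    """Group consecutive tokens that share a role tag into segments.

    A new segment starts wherever the role tag changes, which is how
    per-token role tags imply speaker change points in this sketch.
    """
    if len(tokens) != len(roles):
        raise ValueError("tokens and roles must have the same length")
    segments = []
    start = 0
    for role, run in groupby(roles):
        length = len(list(run))
        segments.append((role, tokens[start:start + length]))
        start += length
    return segments


def token_jaccard_error(reference, hypothesis):
    """Token-level Jaccard error, averaged over the roles in the reference.

    Assumed definition (hypothetical, not confirmed by the source): for each
    role, 1 - |ref ∩ hyp| / |ref ∪ hyp| over token positions. Role labels are
    fixed, so no speaker mapping step is needed.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")
    if not reference:
        raise ValueError("reference must not be empty")
    errors = []
    for role in sorted(set(reference)):
        ref_positions = {i for i, r in enumerate(reference) if r == role}
        hyp_positions = {i for i, r in enumerate(hypothesis) if r == role}
        union = ref_positions | hyp_positions
        errors.append(1.0 - len(ref_positions & hyp_positions) / len(union))
    return sum(errors) / len(errors)


# Invented example: one recording that contains an ATCO instruction
# followed by a pilot readback, with no segmentation between them.
tokens = (
    "lufthansa one two three descend flight level eight zero "
    "descending flight level eight zero lufthansa one two three"
).split()
reference = ["ATCO"] * 9 + ["PILOT"] * 9    # hand-made ground truth
hypothesis = ["ATCO"] * 10 + ["PILOT"] * 8  # one token assigned to the wrong role

for role, words in split_by_role(tokens, hypothesis):
    print(role, " ".join(words))
print(f"token-based Jaccard error: {token_jaccard_error(reference, hypothesis):.3f}")
```

Because the two roles act as fixed labels, this sketch skips the speaker mapping step that time-based diarization scoring normally requires. The scoring used in the paper may differ in detail.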

Type
conference paper
DOI
10.1109/SLT54892.2023.10022718
Web of Science ID

WOS:000968851900086

Author(s)
Zuluaga-Gomez, Juan
Sarfjoo, Seyyed Saeed
Prasad, Amrutha
Nigmatulina, Iuliia
Motlicek, Petr  
Ondrej, Karel
Ohneiser, Oliver
Helmke, Hartmut
Date Issued

2022-01-01

Publisher

IEEE

Publisher place

New York

Published in
2022 IEEE Spoken Language Technology Workshop, SLT
ISBN of the book

979-8-3503-9690-4

Series title/Series vol.

IEEE Workshop on Spoken Language Technology

Start page

633

End page

640

Subjects

  • Computer Science, Artificial Intelligence
  • Linguistics
  • Computer Science
  • text-based speaker diarization
  • speaker change detection
  • speaker role detection
  • air traffic control communications
  • chunking
  • diarization

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
LIDIAP  
Event name

IEEE Spoken Language Technology Workshop (SLT)

Event place

Doha, Qatar

Event date

Jan 09-12, 2023

Available on Infoscience
May 22, 2023
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/197754