Infoscience

conference paper not in proceedings

Towards Leveraging Sequential Structure in Animal Vocalizations

Sarkar, Eklavya; Magimai Doss, Mathew
December 6, 2025
AI for Non-Human Animal Communication. NeurIPS 2025 Workshop

Animal vocalizations contain sequential structures that carry important communicative information, yet most computational bioacoustics studies average the extracted frame-level features across the temporal axis, discarding the order of the sub-units within a vocalization. This paper investigates whether discrete acoustic token sequences, derived through vector quantization and Gumbel-softmax vector quantization of representations extracted from self-supervised speech models, can effectively capture and leverage temporal information. To that end, a pairwise distance analysis of token sequences generated from HuBERT embeddings shows that they can discriminate call-types and callers across four bioacoustics datasets. Sequence classification experiments using k-Nearest Neighbour with Levenshtein distance show that the vector-quantized token sequences yield reasonable call-type and caller classification performance, and hold promise as alternative feature representations for leveraging sequential information in animal vocalizations.
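The pipeline described in the abstract (frame-level embeddings mapped to discrete tokens, then k-Nearest Neighbour classification under Levenshtein distance) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `quantize`, `levenshtein`, and `knn_predict`, the toy two-entry codebook, and the labels "A" and "B" are all hypothetical, and plain nearest-codeword assignment stands in for the learned vector quantization (the Gumbel-softmax variant is not shown).

```python
from collections import Counter


def quantize(frames, codebook):
    """Map each frame vector to the index of its nearest codeword.

    Assumption: squared Euclidean distance to a fixed toy codebook,
    standing in for a learned vector quantizer.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(codebook)), key=lambda k: sq_dist(frame, codebook[k]))
        for frame in frames
    ]


def levenshtein(a, b):
    """Edit distance between two sequences (unit-cost insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x from a
                curr[j - 1] + 1,         # insert y into a
                prev[j - 1] + (x != y),  # substitute, or match at zero cost
            ))
        prev = curr
    return prev[-1]


def knn_predict(query, train_seqs, train_labels, k=1):
    """Majority label among the k training sequences closest to the query."""
    ranked = sorted(
        range(len(train_seqs)),
        key=lambda i: levenshtein(query, train_seqs[i]),
    )
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data: hypothetical token sequences and labels, not taken from the paper.
train_seqs = [[1, 1, 2], [3, 3, 4, 4], [1, 2, 2]]
train_labels = ["A", "B", "A"]
prediction = knn_predict([1, 1, 2, 2], train_seqs, train_labels, k=3)
```

Because the tokens are kept in order rather than averaged over time, two vocalizations with the same token counts but a different ordering end up at a non-zero distance, which is the sequential information the abstract refers to.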

Name: Sarkar_NeurIPS_Tokens_2025.pdf
Type: Main Document
Version: Accepted version
Access type: openaccess
License Condition: CC BY
Size: 642.12 KB
Format: Adobe PDF
Checksum (MD5): 70a3fe2e72c1fdbcac5c523c4f081c09


Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved