conference paper

Scalable Music Cover Retrieval Using Lyrics-Aligned Audio Embeddings

Affolter, Joanne  
•
Martin, Benjamin
•
Epure, Elena V.
•
Meseguer-Brocal, Gabriel
•
Kaplan, Frédéric
2026
Advances in Information Retrieval. 48th European Conference on Information Retrieval, ECIR 2026, Delft, The Netherlands, March 29 – April 2, 2026. Proceedings, Part I
48th European Conference on Information Retrieval (ECIR 2026)

Music Cover Retrieval, also known as Version Identification, aims to recognize distinct renditions of the same underlying musical work, a task central to catalog management, copyright enforcement, and music retrieval. State-of-the-art approaches have largely focused on harmonic and melodic features, employing increasingly complex audio pipelines designed to be invariant to musical attributes that often vary widely across covers. While effective, these methods demand substantial training time and computational resources. By contrast, lyrics constitute a strong invariant across covers, though their use has been limited by the difficulty of extracting them accurately and efficiently from polyphonic audio. Early methods relied on simple frameworks that limited downstream performance, while more recent systems deliver stronger results but require large models integrated within complex multimodal architectures. We introduce LIVI (Lyrics-Informed Version Identification), an approach that seeks to balance retrieval accuracy with computational efficiency. First, LIVI leverages supervision from state-of-the-art transcription and text embedding models during training to achieve retrieval accuracy on par with—or superior to—harmonic-based systems. Second, LIVI remains lightweight and efficient by removing the transcription step at inference, challenging the dominance of complexity-heavy pipelines.
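The retrieval step implied by the abstract can be illustrated with a minimal sketch. It assumes each track has already been mapped to a fixed-size embedding by some audio encoder and that candidates are ranked by cosine similarity. Neither the similarity measure nor the names `cosine`, `rank_versions`, and `catalog` come from the paper; they are hypothetical, and the vectors are toy values rather than real LIVI outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_versions(query_embedding, catalog, top_k=3):
    """Return up to top_k (track_id, score) pairs, highest similarity first."""
    scored = [
        (track_id, cosine(query_embedding, embedding))
        for track_id, embedding in catalog.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy, hand-made vectors standing in for precomputed audio embeddings
# (assumption: a real system would obtain these from a trained encoder).
catalog = {
    "original": [0.9, 0.1, 0.0],
    "acoustic_cover": [0.8, 0.2, 0.1],
    "unrelated_song": [0.0, 0.1, 0.9],
}
query = [0.85, 0.15, 0.05]

ranking = rank_versions(query, catalog, top_k=2)
for track_id, score in ranking:
    print(track_id, round(score, 3))
```

In this sketch no transcription happens at query time: only embeddings are compared, which mirrors the abstract's point that the transcription step is removed at inference. A production catalog would likely replace the linear scan with an approximate nearest-neighbour index.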

Type
conference paper
DOI
10.1007/978-3-032-21289-4_4
Scopus ID

2-s2.0-105035386676

Author(s)
Affolter, Joanne  

École Polytechnique Fédérale de Lausanne

Martin, Benjamin

Deezer Research

Epure, Elena V.

Deezer Research

Meseguer-Brocal, Gabriel

Deezer Research

Kaplan, Frédéric  

École Polytechnique Fédérale de Lausanne

Editors
Campos, Ricardo
•
Jatowt, Adam
•
Lan, Yanyan
•
Aliannejadi, Mohammad
•
Bauer, Christine
•
MacAvaney, Sean
•
Anand, Avishek
•
Ren, Zhaochun
•
Verberne, Suzan
•
Bai, Nan
Date Issued

2026

Publisher

Springer Science and Business Media Deutschland GmbH

Published in
Advances in Information Retrieval. 48th European Conference on Information Retrieval, ECIR 2026, Delft, The Netherlands, March 29 – April 2, 2026. Proceedings, Part I
DOI of the book
https://doi.org/10.1007/978-3-032-21289-4
ISBN of the book

978-3-032-21288-7

978-3-032-21289-4

Series title/Series vol.

Lecture Notes in Computer Science; 16483 LNCS

ISSN (of the series)

1611-3349

0302-9743

Start page

49

End page

66

Subjects

Audio to Text Alignment

•

Music Cover Retrieval

•

Representation Learning

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
DHLAB  
Event name

48th European Conference on Information Retrieval (ECIR 2026)

Event acronym

ECIR 2026

Event place

Delft, Netherlands

Event date

2026-03-29 - 2026-04-02

Available on Infoscience
April 20, 2026
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/262809