research article

A Musically Motivated Mid-Level Representation for Pitch Estimation and Musical Audio Source Separation

Durrieu, Jean-Louis • David, Bertrand • Richard, Gaël
2011
IEEE Journal of Selected Topics in Signal Processing

When designing an audio processing system, the target tasks often influence the choice of a data representation or transformation. Low-level time-frequency representations such as the short-time Fourier transform (STFT) are popular, because they offer a meaningful insight on sound properties for a low computational cost. Conversely, when higher level semantics, such as pitch, timbre or phoneme, are sought after, representations usually tend to enhance their discriminative characteristics, at the expense of their invertibility. They become so-called mid-level representations. In this paper, a source/filter signal model which provides a mid-level representation is proposed. This representation makes the pitch content of the signal as well as some timbre information available, hence keeping as much information from the raw data as possible. This model is successfully used within a main melody extraction system and a lead instrument/accompaniment separation system. Both frameworks obtained top results at several international evaluation campaigns.

Type
research article
DOI
10.1109/JSTSP.2011.2158801
Web of Science ID
WOS:000295012900009
Author(s)
Durrieu, Jean-Louis
David, Bertrand
Richard, Gaël
Date Issued
2011
Published in
IEEE Journal of Selected Topics in Signal Processing
Volume
5
Issue
6
Start page
1180
End page
1191

Subjects
Audio melody extraction • audio signal representation • musical audio source separation • non-negative matrix factorization (NMF) • pitch estimation • Nonnegative Matrix Factorization • Polyphonic Music • Signals • Melody • Identification • Transcription • Similarity • Sounds

URL
http://durrieu.ch/research/jstsp2010.html
Editorial or Peer reviewed
REVIEWED
Written at
EPFL
EPFL units
LTS5  
Available on Infoscience
September 27, 2011
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/71136