Audio Novelty-Based Segmentation of Music Concerts

El Badawy, Dalia; Marmaroli, Patrick; Lissek, Hervé

2013

Abstract

The Swiss Federal Institute of Technology in Lausanne (EPFL) is digitizing an exceptional collection of audio and video recordings of Montreux Jazz Festival (MJF) concerts. Since 1967, five thousand hours of audio and video have been recorded, of which about 60% have been digitized so far. To make these archives easily manageable, ensure the correctness of the supplied metadata, and facilitate copyright management, one of the desired tasks is to determine exactly how many songs a given concert contains and to identify each of them, even in very problematic cases such as medleys or long improvisational passages. However, given the sheer number of recordings to process, having a person listen to each concert and identify every song is cumbersome and time-consuming. It is therefore essential to automate the process. To that end, this paper describes a strategy for automatically detecting the most important changes in a concert audio file; for MJF concerts, those changes correspond to song transitions, interludes, or applause. The presented method belongs to the family of audio novelty-based segmentation methods. The general idea is to first divide a whole concert into short frames, each a few milliseconds long, from which well-chosen audio features are extracted. Then, a similarity matrix is computed, which gives the similarity between each pair of frames. Next, a kernel is correlated along the diagonal of the similarity matrix to obtain the audio novelty scores. Finally, peak detection is used to find significant peaks in the scores, which suggest a change. The main advantage of such a method is that, unlike most classical segmentation algorithms, it requires no training step. Additionally, relatively few audio features are needed, which reduces the amount of computation and the run time.
Such preprocessing is expected to speed up the song identification process: instead of requiring someone to listen to hours of music, the algorithm produces markers that indicate where to start listening. The presented method is evaluated on real concert recordings that have been segmented by hand, and its performance is compared with the state of the art.
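
The four steps described in the abstract (frames and features, similarity matrix, kernel correlation along the diagonal, peak detection) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not name the audio features, the similarity measure, the kernel, or the peak criterion, so the following are all assumptions made for illustration: feature vectors are taken as already extracted, similarity is cosine similarity, the kernel is a plain (untapered) checkerboard, and a peak is a local maximum above a fixed threshold. All function names and the `half_width` and `threshold` parameters are hypothetical. The code uses only the standard library for readability, not speed.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two feature vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def similarity_matrix(features):
    """Similarity between every pair of frames."""
    n = len(features)
    return [[cosine_similarity(features[i], features[j]) for j in range(n)]
            for i in range(n)]


def checkerboard_kernel(half_width):
    """+1 in the past/past and future/future quadrants, -1 in the cross quadrants."""
    size = 2 * half_width
    return [[1.0 if (r < half_width) == (c < half_width) else -1.0
             for c in range(size)]
            for r in range(size)]


def novelty_scores(sim, half_width):
    """Correlate the kernel along the main diagonal of the similarity matrix.

    scores[t] describes a candidate boundary between frame t-1 and frame t.
    Positions too close to either end to fit the kernel keep a score of 0.
    """
    n = len(sim)
    kernel = checkerboard_kernel(half_width)
    size = 2 * half_width
    scores = [0.0] * n
    for t in range(half_width, n - half_width + 1):
        start = t - half_width
        total = 0.0
        for r in range(size):
            for c in range(size):
                total += kernel[r][c] * sim[start + r][start + c]
        scores[t] = total
    return scores


def pick_peaks(scores, threshold):
    """Local maxima above a fixed threshold (assumed peak criterion)."""
    return [t for t in range(1, len(scores) - 1)
            if scores[t] > threshold
            and scores[t] > scores[t - 1]
            and scores[t] >= scores[t + 1]]


def segment_boundaries(features, half_width, threshold):
    """Run the full chain: similarity matrix, novelty scores, peak picking."""
    sim = similarity_matrix(features)
    return pick_peaks(novelty_scores(sim, half_width), threshold)


# Toy input: six frames of one "texture" followed by six frames of another.
features = [[1.0, 0.0]] * 6 + [[0.0, 1.0]] * 6
print(segment_boundaries(features, half_width=3, threshold=9.0))  # prints [6]
```

In this toy input the only detected boundary is at frame index 6, where the feature vectors change. On real recordings the kernel width controls the time scale of the changes that are detected: a wider kernel responds to longer-term changes such as song transitions, while a narrower one also reacts to short events.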

Details

Title Audio Novelty-Based Segmentation of Music Concerts

Author(s) El Badawy, Dalia ; Marmaroli, Patrick ; Lissek, Hervé

Conference Acoustics 2013, New Delhi, November 10-15, 2013

Date 2013

Keywords

audio signal processing; automatic audio segmentation

Laboratories LEMA

Record Appears in Scientific production and competences > STI - School of Engineering > IEM - Institut d'Electricité et de Microtechnique > LEMA - Laboratory of ElectroMagnetics and Antennas
Conference Papers
Work produced at EPFL
Published

Record creation date 2013-12-04
