Sparse non-negative decomposition of speech power spectra for formant tracking

Durrieu, Jean-Louis; Thiran, Jean-Philippe

doi:10.1109/ICASSP.2011.5947544

2011

Abstract

Many works on speech processing have dealt with auto-regressive (AR) models for spectral envelope and formant frequency estimation, mostly focusing on the estimation of the AR parameters. However, it is also of interest to estimate the formant frequencies directly, or equivalently the poles of the AR filter. To tackle this issue, we propose in this paper to decompose the signal onto several bases, one for each formant, taking advantage of recent works on non-negative matrix factorization (NMF) for the estimation stage, further refined by sparsity and smoothness penalties. The results are encouraging, and the proposed system provides formant tracks which seem robust enough to be used in different applications such as phonetic analysis, emotion detection, or as a visual cue for computer-aided pronunciation training applications. The model can also be extended to deal with multiple-speaker signals.
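
The abstract names NMF with a sparsity penalty as the estimation tool. As a rough illustration of that general idea only, the sketch below factorizes a non-negative spectrogram-like matrix with standard Euclidean multiplicative updates and an L1 penalty on the activations. It is not the authors' algorithm: the cost function, the per-formant dictionary construction and the smoothness penalty are not described in this record and are not reproduced here. The function name `sparse_nmf` and all of its parameters are hypothetical.

```python
import numpy as np


def sparse_nmf(V, n_components, sparsity=0.1, n_iter=200, seed=0, eps=1e-12):
    """Approximate a non-negative matrix V (freq x frames) as W @ H.

    Assumption: Euclidean cost with an L1 penalty on H, minimized by
    multiplicative updates. Illustrative only, not the paper's method.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, n_components)) + eps
    H = rng.random((n_components, n_frames)) + eps
    for _ in range(n_iter):
        # Activation update: the sparsity weight in the denominator shrinks H.
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + eps)
        # Basis update.
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        # Rescale to unit-norm basis columns (W @ H is unchanged), so the
        # L1 penalty cannot be evaded by inflating W and shrinking H.
        norms = np.linalg.norm(W, axis=0) + eps
        W /= norms
        H *= norms[:, None]
    return W, H
```

Calling `W, H = sparse_nmf(V, 3)` on a power spectrogram `V` returns non-negative spectral bases in the columns of `W` and their frame-wise activations in the rows of `H`; a larger `sparsity` value pushes more activations toward zero at the cost of reconstruction accuracy.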

Details

Title Sparse non-negative decomposition of speech power spectra for formant tracking

Author(s) Durrieu, Jean-Louis ; Thiran, Jean-Philippe

Published in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing

Pages 5260-5263

Conference International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, May 22-27, 2011

Date 2011

Publisher IEEE Service Center, 445 Hoes Lane, PO Box 1331, Piscataway, NJ 08855-1331, USA

Keywords

Speech Analysis; Autoregressive (AR) Model; Source-Filter Model; Non-negative Matrix Factorization; Sparse Decomposition

DOI https://doi.org/10.1109/ICASSP.2011.5947544

Laboratories LTS5

Record Appears in Scientific production and competences > STI - School of Engineering > IEM - Institut d'Electricité et de Microtechnique > LTS5 - Signal Processing Laboratory 5
Peer-reviewed publications
Conference Papers
Work produced at EPFL
Published

Record creation date 2011-04-19
