Improving Speech Recognition Using a Data-Driven Approach

Aradilla, Guillermo; Vepa, Jithendra; Bourlard, Hervé

doi:10.21437/Interspeech.2005-856

2005

Abstract

In this paper, we investigate the possibility of enhancing state-of-the-art HMM-based speech recognition systems using data-driven techniques, where the whole set of training utterances is used as reference models and recognition is then performed through the well-known template matching technique, dynamic time warping (DTW). This approach allows us to better capture the temporal dynamics of the speech signal while avoiding some HMM assumptions, such as piecewise stationarity. Potentially, such data-driven techniques also allow us to better exploit meta-data and environmental information, such as speaker, gender, accent and noise conditions. However, we cannot entirely abandon HMMs, which are very powerful and scalable models. Thus, we investigate one way to take advantage of both approaches by combining the scores of HMMs and reference templates. Experiments on the Numbers95 database showed that this combination yields a 22% relative improvement in word error rate over the baseline HMM performance. Applying K-means clustering to the acoustic vectors speeds up decoding while still retaining a significant improvement in recognition accuracy.
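
The template matching idea in the abstract can be illustrated with a minimal sketch of DTW-based recognition. This is not the authors' implementation: the Euclidean frame distance, the unconstrained step pattern, and the function names `frame_distance`, `dtw_distance`, `recognize` and `combine_costs` are all assumptions made for illustration. In particular, the abstract does not say how HMM and template scores are combined; the weighted sum below is only one hypothetical possibility.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two acoustic feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(template, utterance):
    """Cumulative cost of the best DTW alignment between two frame sequences."""
    n, m = len(template), len(utterance)
    inf = float("inf")
    # cost[i][j] = best cost of aligning the first i template frames
    # with the first j utterance frames.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(template[i - 1], utterance[j - 1])
            # Allowed moves: advance in the template, in the utterance, or in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(templates, utterance):
    """Return the label of the reference template closest to the utterance.

    `templates` is a list of (label, frame_sequence) pairs.
    """
    best_label, _ = min(templates, key=lambda t: dtw_distance(t[1], utterance))
    return best_label


def combine_costs(hmm_cost, dtw_cost, weight=0.5):
    """Hypothetical combination: weighted sum of an HMM cost and a DTW cost.

    The actual combination rule used in the paper is not given in the abstract.
    """
    return weight * hmm_cost + (1.0 - weight) * dtw_cost


if __name__ == "__main__":
    # Toy one-dimensional "feature vectors"; real systems use multi-dimensional frames.
    templates = [
        ("one", [[0.0], [1.0], [2.0]]),
        ("two", [[5.0], [5.0], [6.0]]),
    ]
    utterance = [[0.0], [0.0], [1.0], [2.0]]
    print(recognize(templates, utterance))
```

Because every training utterance serves as a reference, the cost of this search grows with the size of the training set, which is presumably why the abstract mentions K-means clustering of the acoustic vectors as a way to speed up decoding.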

Details

Title Improving Speech Recognition Using a Data-Driven Approach

Author(s) Aradilla, Guillermo ; Vepa, Jithendra ; Bourlard, Hervé

Published in Proceedings of Interspeech 2005

Pages 3333-3336

Conference Interspeech 2005

Date 2005

Publisher Martigny, Switzerland

Keywords

speech; aradilla; vepa; bourlard

Note IDIAP-RR 05-66

DOI https://doi.org/10.21437/Interspeech.2005-856

Laboratories LIDIAP


Record creation date 2006-03-10
