The Kaldi Speech Recognition Toolkit

Povey, Daniel; Ghoshal, Arnab; Boulianne, Gilles; Burget, Lukas; Glembek, Ondrej; Goel, Nagendra; Hannemann, Mirko; Motlicek, Petr; Qian, Yanmin; Schwarz, Petr; Silovsky, Jan; Stemmer, Georg; Vesely, Karel

2011

Abstract

We describe the design of Kaldi, a free, open-source toolkit for speech recognition research. Kaldi provides a speech recognition system based on finite-state transducers (using the freely available OpenFst), together with detailed documentation and scripts for building complete recognition systems. Kaldi is written in C++, and the core library supports modeling of arbitrary phonetic-context sizes, acoustic modeling with subspace Gaussian mixture models (SGMM) as well as standard Gaussian mixture models, together with all commonly used linear and affine transforms. Kaldi is released under the Apache License v2.0, which is highly nonrestrictive, making it suitable for a wide community of users.

Details

Title The Kaldi Speech Recognition Toolkit

Author(s) Povey, Daniel ; Ghoshal, Arnab ; Boulianne, Gilles ; Burget, Lukas ; Glembek, Ondrej ; Goel, Nagendra ; Hannemann, Mirko ; Motlicek, Petr ; Qian, Yanmin ; Schwarz, Petr ; Silovsky, Jan ; Stemmer, Georg ; Vesely, Karel

Conference IEEE 2011 Workshop on Automatic Speech Recognition and Understanding, Hilton Waikoloa Village, Big Island, Hawaii, US

Date 2011

Publisher IEEE Signal Processing Society

Keywords ASR; Automatic Speech Recognition; GMM; HTK; SGMM

Note IEEE Catalog No.: CFP11SRW-USB

Laboratories LIDIAP

Record creation date 2013-12-19
