Approaches to automatic lexicon learning with limited training examples

Goel, Nagendra; Thomas, Samuel; Agarwal, Mohit; Akyazi, Pinar; Burget, Lukas; Feng, Kai; Ghoshal, Arnab; Glembek, Ondrej; Karafiat, Martin; Povey, Daniel; Rastrow, Ariya; Rose, Richard C.; Schwarz, Petr

doi:10.1109/ICASSP.2010.5495037

conference paper

Preparing a lexicon for a speech recognition system can require significant effort in languages where the written form is not exactly phonetic. On the other hand, in languages where the written form is quite phonetic, some common words are often mispronounced. In this paper, we use a combination of lexicon learning techniques to explore whether a lexicon can be learned when only a small lexicon is available for bootstrapping. We find that for a phonetic language such as Spanish, this is possible and gives better results than generic rules or hand-crafted pronunciations. For a more complex language such as English, we find that it is still possible, but with some loss of accuracy.

Type

conference paper

DOI

10.1109/ICASSP.2010.5495037

Authors

Goel, Nagendra; Thomas, Samuel; Agarwal, Mohit; Akyazi, Pinar; Burget, Lukas; Feng, Kai; Ghoshal, Arnab; Glembek, Ondrej; Karafiat, Martin; Povey, Daniel; Rastrow, Ariya; Rose, Richard C.; Schwarz, Petr

Publication date

2010

Publisher

IEEE

Published in

2010 IEEE International Conference on Acoustics, Speech and Signal Processing

Start page

5094

End page

5097

Subjects

Lexicon Learning

LVCSR

Peer reviewed

REVIEWED

EPFL units

IEL

Event name

2010 IEEE International Conference on Acoustics, Speech and Signal Processing

Event place

Dallas, TX, USA

Event date

March 14-19, 2010

Available on Infoscience

November 19, 2014

Use this identifier to reference this record

https://infoscience.epfl.ch/handle/20.500.14299/108952