Automatic Out-of-Language Detection Based on Confidence Measures Derived from LVCSR Word and Phone Lattices

Motlicek, Petr

doi:10.21437/Interspeech.2009-351

2009

Abstract

Confidence Measures (CMs) estimated from Large Vocabulary Continuous Speech Recognition (LVCSR) outputs are commonly used to detect incorrectly recognized words. In this paper, we propose to exploit CMs derived from frame-based word and phone posteriors to detect speech segments containing pronunciations from non-target (alien) languages. The LVCSR system is built for English, the target language, with a medium-size recognition vocabulary (5k words). Detection performance is tested on a set comprising speech from three languages (English, German, Czech). The results indicate that incorporating specific temporal context (at the word or phone level) significantly increases detection accuracy. Furthermore, we show that combining several CMs can also improve detection performance.
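
As a rough illustration of the idea in the abstract, the sketch below scores each frame with two simple confidence measures, smooths them over a temporal context window, combines them, and flags low-confidence frames as possible out-of-language speech. It is not the paper's method: the two measures (maximum posterior and normalized negative entropy), the moving-average smoothing, the equal-weight combination, the threshold value, and all function and parameter names are assumptions chosen for illustration. The input is assumed to be a list of per-frame posterior distributions that each sum to 1.

```python
import math


def max_posterior(frame):
    """Frame confidence: the largest posterior in the distribution."""
    return max(frame)


def entropy_confidence(frame):
    """Frame confidence: 1 minus entropy normalized by its maximum, log(N)."""
    h = -sum(p * math.log(p) for p in frame if p > 0.0)
    return 1.0 - h / math.log(len(frame))


def smooth(values, context):
    """Average each value with up to `context` neighbours on each side."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - context)
        hi = min(len(values), i + context + 1)
        window = values[lo:hi]
        out.append(sum(window) / len(window))
    return out


def detect_out_of_language(posteriors, context=1, threshold=0.6):
    """Return one flag per frame; True marks a low-confidence frame.

    `context` and `threshold` are hypothetical tuning parameters.
    """
    cm_max = smooth([max_posterior(f) for f in posteriors], context)
    cm_ent = smooth([entropy_confidence(f) for f in posteriors], context)
    # Equal-weight combination of the two measures (an assumption).
    combined = [(a + b) / 2.0 for a, b in zip(cm_max, cm_ent)]
    return [c < threshold for c in combined]


if __name__ == "__main__":
    confident = [0.97, 0.01, 0.01, 0.01]   # recognizer is sure of one unit
    uncertain = [0.25, 0.25, 0.25, 0.25]   # posterior mass spread evenly
    frames = [confident] * 4 + [uncertain] * 4
    print(detect_out_of_language(frames, context=1, threshold=0.6))
```

A wider `context` trades temporal resolution for stability: isolated low-confidence frames are averaged away, while sustained runs of flat posteriors still fall below the threshold.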

Details

Title Automatic Out-of-Language Detection Based on Confidence Measures Derived from LVCSR Word and Phone Lattices

Author(s) Motlicek, Petr

Published in Interspeech 2009

Series 2009 ISCA

Pages 1215-1218

Conference ISCA - 10th Annual Conference of the International Speech Communication Association, Brighton, England

Date 2009

DOI https://doi.org/10.21437/Interspeech.2009-351

Laboratories LIDIAP

Record creation date 2010-02-11