From Research to Reality: Evaluation of a Single-Computer Real-Time LVCSR System for Speech-Based Retrieval
This paper presents a series of tests performed on a state-of-the-art real-time automatic speech recognition (ASR) system for English, in a single-computer implementation. Because the system is intended for speech-based, query-free document retrieval in conversations, several parameters were varied: text type, microphone quality, computing power, speaker fluency, and pace of the speech. Word accuracy, measured over various word counts including a restriction to content words, ranged from 30% to 70%. The paper compares results across many conditions and concludes that the ASR system is acceptable for the intended use only if all the parameters are in optimal conditions. If more than two parameters are suboptimal, its output becomes too noisy for document retrieval.
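As an illustration of the metric mentioned in the abstract, the following is a minimal sketch of how word accuracy could be computed, both over all words and over content words only. It assumes the common definition of word accuracy as one minus the word error rate (word-level edit distance divided by the reference length); the abstract does not state the exact scoring procedure used in the paper. The function names and the small `STOP_WORDS` set are hypothetical and serve only as an example.

```python
# Hypothetical stop-word list used to approximate "content words";
# the paper's actual definition of content words is not given in the abstract.
STOP_WORDS = {"the", "a", "an", "of", "to", "from", "in", "and", "is", "for"}


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance (substitutions, deletions, insertions)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(ref_words, hyp_words):
    """Assumed definition: 1 - (edit distance / number of reference words)."""
    if not ref_words:
        raise ValueError("reference transcript must not be empty")
    return 1.0 - edit_distance(ref_words, hyp_words) / len(ref_words)


def content_words(words):
    """Keep only the words that are not in the hypothetical stop-word list."""
    return [w for w in words if w not in STOP_WORDS]
```

Under this definition, accuracy can become negative when the recognizer inserts many spurious words, so scores are best compared on transcripts of similar length.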
Popescu-Belis_Idiap-RR-12-2017.pdf