Assessing the effectiveness of slides as a mean to improve the automatic transcription of oral presentations

Peregoudov, Artem; Vinciarelli, Alessandro; Bourlard, Hervé

report

2006

This paper presents experiments aimed at improving the automatic transcription of oral presentations by including the slides in the recognition process. The experiments are performed over a data set of around three hours of material (~33 kwords and 270 slides) and are based on an approach that tries to maximize the similarity between the recognizer output and the content of the slides. The results show that the upper bound on the Word Error Rate (WER) reduction is 1.7% (obtained by correctly transcribing all words that occur in both slides and speech), but that our approach does not produce statistically significant improvements. Analysis of the results suggests that this outcome depends not on the similarity maximization approach, but on the statistical characteristics of the language.
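
As an illustration of the upper bound mentioned in the abstract, the following minimal Python sketch computes the WER from a word-level Levenshtein alignment and then an "oracle" WER in which every misrecognized reference word that also appears on the slides is counted as correct. This is not the authors' procedure: the function names (`align`, `wer`, `oracle_wer`), the toy sentences, and the simplification that each corrected slide word removes exactly one error are all assumptions made for the example.

```python
def align(ref, hyp):
    """Word-level Levenshtein alignment.

    Returns a list of (op, ref_word, hyp_word) with op in
    {"ok", "sub", "del", "ins"}.
    """
    n, m = len(ref), len(hyp)
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i
    for j in range(1, m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j - 1] + cost,  # match or substitution
                          d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1)         # insertion
    # Backtrace from the bottom-right corner to recover one optimal alignment.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            if d[i][j] == d[i - 1][j - 1] + cost:
                ops.append(("ok" if cost == 0 else "sub", ref[i - 1], hyp[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and d[i][j] == d[i - 1][j] + 1:
            ops.append(("del", ref[i - 1], None))
            i -= 1
        else:
            ops.append(("ins", None, hyp[j - 1]))
            j -= 1
    ops.reverse()
    return ops


def wer(ref, hyp):
    """WER = (substitutions + deletions + insertions) / reference length."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    errors = sum(1 for op, _, _ in align(ref, hyp) if op != "ok")
    return errors / len(ref)


def oracle_wer(ref, hyp, slide_words):
    """WER if every substituted or deleted reference word that also occurs
    on the slides were transcribed correctly (simplified assumption: each
    such correction removes exactly one error)."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    ops = align(ref, hyp)
    errors = sum(1 for op, _, _ in ops if op != "ok")
    fixable = sum(1 for op, r, _ in ops
                  if op in ("sub", "del") and r in slide_words)
    return (errors - fixable) / len(ref)


# Toy data, invented for illustration only.
reference = "the hidden markov model is trained on speech".split()
hypothesis = "the hidden mark of model is trained speech".split()
slides = {"markov", "model", "speech"}

baseline = wer(reference, hypothesis)
bound = oracle_wer(reference, hypothesis, slides)
print(f"baseline WER: {baseline:.3f}, oracle WER: {bound:.3f}")
```

The difference between the baseline and the oracle value corresponds to the kind of upper bound on WER reduction reported above: it can only be large when many recognition errors fall on words that the slides also contain.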

Name: vinciarelli-idiap-rr-06-56.pdf

Access type: openaccess

Size: 112.36 KB

Format: Adobe PDF

Checksum (MD5): 0e9614252c4303ff738908472a7478a7