Towards using slide information to enhance speech transcription of meetings

Peregoudov, Artem; Vinciarelli, Alessandro; Bourlard, Hervé

report

2006

In this paper we investigate whether slides captured during the recording process can improve speech recognition performance on meeting recordings. The key hypothesis is that slides and speech carry correlated contextual and semantic information. We therefore propose an approach that uses information extracted from the slides to reduce the speech recognition word error rate. The N-best lists output by the recogniser are rescored using Information Retrieval techniques to maximise the similarity between the speech and slide transcripts. Results obtained on three meeting recordings (about 90 minutes in total) show no statistically significant change in the word error rate. Additional studies provide further insight based on both language properties and the statistics of the word distributions in the two sources.
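
The rescoring idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which similarity measure or score combination the report uses, so the term-frequency cosine similarity, the min-max normalisation of recogniser scores, the `weight` parameter and all function names below are assumptions made purely for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rescore_nbest(nbest, slide_text, weight=0.5):
    """Re-rank (hypothesis, recogniser_score) pairs, best first.

    Assumption: higher recogniser scores are better (e.g. log-likelihoods).
    The combined score interpolates the min-max normalised recogniser
    score with the hypothesis-to-slide cosine similarity.
    """
    if not nbest:
        return []
    slide_vec = Counter(tokenize(slide_text))
    scores = [score for _, score in nbest]
    lowest, highest = min(scores), max(scores)
    span = highest - lowest
    rescored = []
    for hypothesis, score in nbest:
        normalised = (score - lowest) / span if span > 0 else 1.0
        similarity = cosine(Counter(tokenize(hypothesis)), slide_vec)
        combined = (1 - weight) * normalised + weight * similarity
        rescored.append((hypothesis, combined))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Hypothetical example data, not taken from the report.
nbest = [("we recognise beach", -10.0), ("we recognise speech", -10.5)]
slide_text = "speech recognition word error rate"
best_hypothesis = rescore_nbest(nbest, slide_text, weight=0.9)[0][0]
```

With a large `weight`, the hypothesis sharing vocabulary with the slide is promoted. With `weight=0.0`, the original recogniser ranking is kept, which makes the interpolation weight a convenient knob for checking whether slide similarity helps at all.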

Use this identifier to reference this record

https://infoscience.epfl.ch/handle/20.500.14299/228786

Name: peregoud-idiap-rr-06-01.pdf

Access type: openaccess

Size: 251.25 KB

Format: Adobe PDF

Checksum (MD5): 0b7452fc3bd1677397a497832e5da3f5