Transcribing meetings with the AMIDA systems

Hain, Thomas; Burget, Lukas; Dines, John; Garner, Philip N.; Grezl, Frantisek; El Hannani, Asmaa; Huijbregts, Marijn; Karafiat, Martin; Lincoln, Mike; Wan, Vincent

doi:10.1109/TASL.2011.2163395

2012

Abstract

In this paper we give an overview of the AMIDA systems for transcription of conference and lecture room meetings. The systems were developed for participation in the Rich Transcription evaluations conducted by the National Institute of Standards and Technology in 2007 and 2009, and can process close-talking and far-field microphone recordings. The paper first discusses fundamental properties of meeting data, with special focus on the AMI/AMIDA corpora. This is followed by a description and analysis of improved processing and modelling, with focus on techniques specifically addressing meeting transcription issues such as multi-room recordings or domain variability. In 2007 and 2009 two different system-building strategies were followed. While in 2007 we used our traditional system design based on cross adaptation, the 2009 systems were constructed semi-automatically, supported by improved decoders and a new method for system representation. Overall these changes gave a 6-13% relative reduction in word error rate compared to our 2007 results, while at the same time requiring less training material and reducing the real-time factor by a factor of five. The meeting transcription systems are available at www.webasr.org.

Details

Title Transcribing meetings with the AMIDA systems

Author(s) Hain, Thomas ; Burget, Lukas ; Dines, John ; Garner, Philip N. ; Grezl, Frantisek ; El Hannani, Asmaa ; Huijbregts, Marijn ; Karafiat, Martin ; Lincoln, Mike ; Wan, Vincent

Published in IEEE Transactions on Audio, Speech, and Language Processing

Volume 20

Issue 2

Pages 486-498

Date 2012

DOI https://doi.org/10.1109/TASL.2011.2163395

Laboratories LIDIAP

Record Appears in Scientific production and competences > STI - School of Engineering > IEM - Institut d'Electricité et de Microtechnique > LIDIAP - L'IDIAP Laboratory
Scientific production and competences > Euler Center for Signal Processing
Peer-reviewed publications
Work produced at EPFL
Journal Articles
Published

Record creation date 2013-12-19
