Reference-based vs. task-based evaluation of human language technology

Popescu-Belis, Andrei

2008

Abstract

This paper starts from the ISO distinction of three types of evaluation procedures – internal, external and in use – and proposes to match these types to the three types of human language technology (HLT) systems: analysis, generation, and interactive. The paper explains why internal evaluation is not suitable to measure the qualities of HLT systems, and shows that reference-based external evaluation is best adapted to ‘analysis’ systems, task-based evaluation to ‘interactive’ systems, while ‘generation’ systems can be subject to both types of evaluation. In particular, some limits of reference-based external evaluation are shown in the case of generation systems. Finally, the paper shows that contextual evaluation, as illustrated by the FEMTI framework for MT evaluation, is an effective method for getting reference-based evaluation closer to the users of a system.

Details

Title Reference-based vs. task-based evaluation of human language technology

Author(s) Popescu-Belis, Andrei

Conference ELRA - LREC 2008 ELRA Workshop on Evaluation, Marrakech, Morocco

Date 2008

Laboratories LIDIAP

Record Appears in Scientific production and competences > STI - School of Engineering > IEM - Institut d'Electricité et de Microtechnique > LIDIAP - L'IDIAP Laboratory
Scientific production and competences > Euler Center for Signal Processing
Conference Papers
Work produced at EPFL
Published

Record creation date 2010-02-11
