Abstract

The evaluation of errors made by Machine Translation (MT) systems still requires human effort, despite the availability of automated MT evaluation tools such as the BLEU metric. Moreover, even if there were tools that supported humans in this translation quality checking task, for example by automatically marking errors found in the MT system output, there is no guarantee that such support would actually lead to a more accurate or faster human evaluation. This paper presents a user study that found statistically significant interaction effects, in terms of both the time needed to complete the task and the number of correctly identified errors, for the task of finding MT errors under two conditions: non-annotated output and output with automatically pre-annotated errors.
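
As a concrete illustration of the kind of automated scoring mentioned above, the following minimal sketch computes a corpus-level BLEU score for MT outputs against reference translations. It assumes the third-party sacrebleu package; the example sentences are invented for illustration and are not taken from the study.

```python
# Minimal sketch of automated MT scoring with BLEU, assuming the
# third-party sacrebleu package (pip install sacrebleu).
# The sentences below are illustrative only.
import sacrebleu

# Hypothetical MT system outputs, one string per segment.
hypotheses = [
    "The cat sits on the mat.",
    "He go to school every day.",
]

# One set of reference translations, aligned with the hypotheses.
# sacrebleu accepts multiple reference sets; here we pass a single one.
references = [[
    "The cat is sitting on the mat.",
    "He goes to school every day.",
]]

# corpus_bleu returns a BLEUScore object; .score is the BLEU value on a 0-100 scale.
bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.2f}")
```

A score like this summarizes surface overlap with the references but does not locate individual errors in the output, which is why the human error-finding task studied in the paper still matters.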
