Infoscience

conference paper

On the Limitations of Cross-lingual Encoders as Exposed by Reference-Free Machine Translation Evaluation

Zhao, Wei • Glavas, Goran • Peyrard, Maxime • Gao, Yang • West, Robert • Eger, Steffen
January 1, 2020
58th Annual Meeting of the Association for Computational Linguistics (ACL 2020)

Evaluation of cross-lingual encoders is usually performed either via zero-shot cross-lingual transfer in supervised downstream tasks or via unsupervised cross-lingual textual similarity. In this paper, we concern ourselves with reference-free machine translation (MT) evaluation where we directly compare source texts to (sometimes low-quality) system translations, which represents a natural adversarial setup for multilingual encoders. Reference-free evaluation holds the promise of web-scale comparison of MT systems. We systematically investigate a range of metrics based on state-of-the-art cross-lingual semantic representations obtained with pretrained M-BERT and LASER. We find that they perform poorly as semantic encoders for reference-free MT evaluation and identify their two key limitations, namely, (a) a semantic mismatch between representations of mutual translations and, more prominently, (b) the inability to punish "translationese", i.e., low-quality literal translations. We propose two partial remedies: (1) post-hoc re-alignment of the vector spaces and (2) coupling of semantic-similarity based metrics with target-side language modeling. In segment-level MT evaluation, our best metric surpasses reference-based BLEU by 5.7 correlation points. We make our MT evaluation code available.
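
The second remedy named in the abstract, coupling a semantic-similarity metric with target-side language modeling, can be illustrated with a minimal sketch. This is not the authors' released code: the function names, the linear combination, and the default `lm_weight` of 0.1 are all assumptions chosen for illustration, and the sentence vectors and token log-probabilities are assumed to come from some cross-lingual encoder and target-language model that are not shown here.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for a zero vector: no similarity
    return dot / (norm_u * norm_v)


def reference_free_score(
    src_vec: Sequence[float],
    hyp_vec: Sequence[float],
    hyp_token_logprobs: Sequence[float],
    lm_weight: float = 0.1,  # hypothetical weight, not taken from the paper
) -> float:
    """Combine source/translation similarity with target-side fluency.

    src_vec and hyp_vec are assumed to be sentence embeddings of the source
    text and the system translation in a shared cross-lingual space.
    hyp_token_logprobs are assumed to be per-token log-probabilities of the
    translation under a target-language model.
    """
    if not hyp_token_logprobs:
        raise ValueError("need at least one token log-probability")
    similarity = cosine_similarity(src_vec, hyp_vec)
    # Mean log-probability: higher (closer to zero) means more fluent output.
    fluency = sum(hyp_token_logprobs) / len(hyp_token_logprobs)
    return similarity + lm_weight * fluency
```

Under these assumptions, two translations with identical embeddings are ranked by fluency alone, which is the intended effect of the language-model term: a literal, "translationese" output that the encoder considers semantically close can still be penalized.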

Type
conference paper
DOI
10.18653/v1/2020.acl-main.151
Web of Science ID

WOS:000570978201086

Author(s)
Zhao, Wei
Glavas, Goran
Peyrard, Maxime  
Gao, Yang
West, Robert  
Eger, Steffen
Date Issued

2020-01-01

Publisher

Association for Computational Linguistics (ACL)

Publisher place

Stroudsburg

Published in
58th Annual Meeting of the Association for Computational Linguistics (ACL 2020)
ISBN of the book

978-1-952148-25-5

Start page

1656

End page

1671

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
DLAB  
Event name
58th Annual Meeting of the Association for Computational Linguistics (ACL)
Event place
ELECTR NETWORK
Event date
Jul 05-10, 2020

Available on Infoscience
October 15, 2020
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/172491
Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved