Stop Pre-Training: Adapt Visual-Language Models to Unseen Languages

Karoui, Yasmine; Lebret, Remi; Foroutan, Negar; Aberer, Karl

conference paper

January 1, 2023

61st Conference of the Association for Computational Linguistics, ACL 2023, Vol 2

61st Annual Meeting of the Association for Computational Linguistics (ACL)

Vision-Language Pre-training (VLP) has advanced the performance of many vision-language tasks, such as image-text retrieval, visual entailment, and visual reasoning. The pre-training mostly utilizes lexical databases and image queries in English. Previous work has demonstrated that pre-training in English does not transfer well to other languages in a zero-shot setting. However, multilingual pre-trained language models (MPLM) have excelled at a variety of single-modal language tasks. In this paper, we propose a simple yet efficient approach to adapt VLP to unseen languages using MPLM. We utilize a cross-lingual contextualized token embedding alignment approach to train text encoders for non-English languages. Our approach does not require image input and primarily uses machine translation, eliminating the need for target language data. Our evaluation across three distinct tasks (image-text retrieval, visual entailment, and natural language visual reasoning) demonstrates that this approach outperforms the state-of-the-art multilingual vision-language models without requiring large parallel corpora. Our code is available at https://github.com/Yasminekaroui/CliCoTea.
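
As an illustration of the token embedding alignment idea mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a mean-squared-error objective over word-aligned token pairs; the abstract does not state the loss. The function name `alignment_loss`, the teacher/student naming, and the toy vectors are all hypothetical. In practice the embeddings would come from an English VLP text encoder and a multilingual encoder, and the index pairs from a word aligner run on machine-translated sentence pairs.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def alignment_loss(
    teacher_tokens: List[Vector],
    student_tokens: List[Vector],
    aligned_pairs: List[Tuple[int, int]],
) -> float:
    """Mean squared distance between aligned token embeddings.

    teacher_tokens: contextual embeddings of the English sentence
        (assumed to come from the English VLP text encoder).
    student_tokens: contextual embeddings of the translated sentence
        (assumed to come from the multilingual encoder being trained).
    aligned_pairs: (english_index, target_index) word-alignment links.
    """
    if not aligned_pairs:
        return 0.0
    total = 0.0
    for en_idx, tgt_idx in aligned_pairs:
        teacher = teacher_tokens[en_idx]
        student = student_tokens[tgt_idx]
        # Squared Euclidean distance, averaged over embedding dimensions.
        total += sum((t - s) ** 2 for t, s in zip(teacher, student)) / len(teacher)
    # Average over all aligned token pairs.
    return total / len(aligned_pairs)


# Toy example with 2-dimensional embeddings (illustrative values only).
teacher = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
student = [[0.0, 1.0], [1.0, 0.5]]
pairs = [(0, 1), (1, 0)]  # English token 0 <-> target token 1, and so on
loss = alignment_loss(teacher, student, pairs)
```

Only the text side appears in this sketch, which is consistent with the abstract's statement that the approach does not require image input.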

Type

conference paper

Web of Science ID

WOS:001181088800032

Author(s)

Karoui, Yasmine

Lebret, Remi

Foroutan, Negar

Aberer, Karl

Editors

Boyd-Graber, J

Okazaki, N

Rogers, A

Date Issued

2023-01-01

Publisher

Association for Computational Linguistics (ACL)

Publisher place

Stroudsburg

Published in

61st Conference of the Association for Computational Linguistics, ACL 2023, Vol 2

ISBN of the book

978-1-959429-71-5

Start page

366

End page

375

Subjects

Technology

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units

LSIR

Event name	Event place	Event date
61st Annual Meeting of the Association for Computational Linguistics (ACL)	Toronto, Canada	July 9-14, 2023

Available on Infoscience

May 1, 2024

Use this identifier to reference this record

https://infoscience.epfl.ch/handle/20.500.14299/207606