Infoscience
conference paper

Stop Pre-Training: Adapt Visual-Language Models to Unseen Languages

Karoui, Yasmine
•
Lebret, Remi  
•
Foroutan, Negar  
•
Aberer, Karl
Boyd-Graber, J
•
Okazaki, N
•
Rogers, A
January 1, 2023
61st Conference of the Association for Computational Linguistics, ACL 2023, Vol 2
61st Annual Meeting of the Association for Computational Linguistics (ACL)

Vision-Language Pre-training (VLP) has advanced the performance of many vision-language tasks, such as image-text retrieval, visual entailment, and visual reasoning. The pre-training mostly utilizes lexical databases and image queries in English. Previous work has demonstrated that the pre-training in English does not transfer well to other languages in a zero-shot setting. However, multilingual pre-trained language models (MPLM) have excelled at a variety of single-modal language tasks. In this paper, we propose a simple yet efficient approach to adapt VLP to unseen languages using MPLM. We utilize a cross-lingual contextualized token-embedding alignment approach to train text encoders for non-English languages. Our approach does not require image input and primarily uses machine translation, eliminating the need for target-language data. Our evaluation across three distinct tasks (image-text retrieval, visual entailment, and natural language visual reasoning) demonstrates that this approach outperforms the state-of-the-art multilingual vision-language models without requiring large parallel corpora. Our code is available at https://github.com/Yasminekaroui/CliCoTea.
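The sketch below is only an illustration of the kind of cross-lingual contextualized token-embedding alignment the abstract describes: token embeddings from a frozen English text encoder are matched, via word-alignment pairs between an English sentence and its machine translation, to the corresponding token embeddings of a multilingual encoder being trained. All names, tensor shapes, the toy alignment pairs, and the squared-distance loss are assumptions for illustration, not the paper's implementation.

```python
# Hypothetical sketch of cross-lingual contextualized token-embedding alignment.
# Names and the loss choice are illustrative assumptions, not the authors' code.
import torch
import torch.nn as nn


class TokenAlignmentLoss(nn.Module):
    """Mean squared distance between aligned contextualized token embeddings."""

    def forward(self, teacher_tokens, student_tokens, alignment):
        # alignment: list of (teacher_idx, student_idx) pairs, e.g. from a
        # word aligner run on an English sentence and its machine translation.
        t_idx = torch.tensor([p[0] for p in alignment])
        s_idx = torch.tensor([p[1] for p in alignment])
        return ((teacher_tokens[t_idx] - student_tokens[s_idx]) ** 2).mean()


# Toy usage: English embeddings from a frozen VLP text encoder (teacher) vs.
# target-language embeddings from a multilingual encoder being trained (student).
hidden = 768
teacher = torch.randn(12, hidden)                       # 12 English tokens, frozen
student = torch.randn(15, hidden, requires_grad=True)   # 15 target-language tokens
pairs = [(0, 0), (1, 2), (2, 3), (5, 7)]                # hypothetical alignment pairs

loss = TokenAlignmentLoss()(teacher, student, pairs)
loss.backward()  # gradients flow only into the student embeddings
```

Because only text embeddings are compared, no images are needed during adaptation, which is consistent with the abstract's claim that the approach relies on machine translation rather than target-language image-text data.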

Details
Type
conference paper
Web of Science ID

WOS:001181088800032

Author(s)
Karoui, Yasmine
•
Lebret, Remi  
•
Foroutan, Negar  
•
Aberer, Karl  
Editors
Boyd-Graber, J
•
Okazaki, N
•
Rogers, A
Date Issued

2023-01-01

Publisher

Association for Computational Linguistics (ACL)

Publisher place

Stroudsburg

Published in
61st Conference of the Association for Computational Linguistics, ACL 2023, Vol 2
ISBN of the book

978-1-959429-71-5

Start page

366

End page

375

Subjects

Technology

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
LSIR  
Event name

61st Annual Meeting of the Association for Computational Linguistics (ACL)

Event place

Toronto, Canada

Event date

July 9-14, 2023

Available on Infoscience
May 1, 2024
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/207606