research article

PARSINLU: A Suite of Language Understanding Challenges for Persian

Khashabi, Daniel
Cohan, Arman
Shakeri, Siamak
January 1, 2021
Transactions Of The Association For Computational Linguistics

Despite the progress made in recent years in addressing natural language understanding (NLU) challenges, most of this progress remains concentrated on resource-rich languages like English. This work focuses on Persian, one of the world's widely spoken languages, for which few NLU datasets are nonetheless available. High-quality evaluation datasets are necessary for reliably assessing progress on different NLU tasks and domains. We introduce PARSINLU, the first benchmark in Persian that includes a range of language understanding tasks, such as reading comprehension and textual entailment, among others. These datasets are collected in a multitude of ways, often involving manual annotations by native speakers. This results in over 14.5k new instances across 6 distinct NLU tasks. Additionally, we present the first results of state-of-the-art monolingual and multilingual pre-trained language models on this benchmark and compare them with human performance, which provides valuable insights into our ability to tackle natural language understanding challenges in Persian. We hope PARSINLU fosters further research and advances in Persian language understanding.

Type
research article
DOI
10.1162/tacl_a_00419
Web of Science ID

WOS:000751952200068

Author(s)
Khashabi, Daniel
Cohan, Arman
Shakeri, Siamak
Hosseini, Pedram
Pezeshkpour, Pouya
Alikhani, Malihe
Aminnaseri, Moin
Bitaab, Marzieh
Brahman, Faeze
Ghazarian, Sarik
Date Issued

2021-01-01

Published in
Transactions Of The Association For Computational Linguistics
Volume

9

Start page

1147

End page

1162

Subjects

  • Computer Science, Artificial Intelligence
  • Linguistics
  • Language & Linguistics
  • Computer Science
  • Linguistics
  • agreement

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

Available on Infoscience
March 14, 2022
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/186415