conference paper

CRoW: Benchmarking Commonsense Reasoning in Real-World Tasks

Author(s): Ismayilzada, Mete • Paul, Debjit • Montariol, Syrielle • Geva, Mor • Bosselut, Antoine
Editors: Bouamor, Houda • Pino, Juan • Bali, Kalika
2023
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
The 2023 Conference on Empirical Methods in Natural Language Processing

Recent efforts in natural language processing (NLP) commonsense reasoning research have yielded a considerable number of new datasets and benchmarks. However, most of these datasets formulate commonsense reasoning challenges in artificial scenarios that are not reflective of the tasks which real-world NLP systems are designed to solve. In this work, we present CRoW, a manually-curated, multi-task benchmark that evaluates the ability of models to apply commonsense reasoning in the context of six real-world NLP tasks. CRoW is constructed using a multi-stage data collection pipeline that rewrites examples from existing datasets using commonsense-violating perturbations. We use CRoW to study how NLP systems perform across different dimensions of commonsense knowledge, such as physical, temporal, and social reasoning. We find a significant performance gap when NLP systems are evaluated on CRoW compared to humans, showcasing that commonsense reasoning is far from being solved in real-world task settings. We make our dataset and leaderboard available to the research community.

Type
conference paper
DOI
10.18653/v1/2023.emnlp-main.607
Author(s)
Ismayilzada, Mete  

EPFL

Paul, Debjit  

École Polytechnique Fédérale de Lausanne

Montariol, Syrielle  

EPFL

Geva, Mor

DeepMind (United Kingdom)

Bosselut, Antoine  

EPFL

Editors
Bouamor, Houda • Pino, Juan • Bali, Kalika
Date Issued

2023

Publisher

Association for Computational Linguistics (ACL)

Publisher place

Singapore

Published in
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
ISBN of the book

979-8-89176-060-8

Start page

9785

End page

9821

Subjects

commonsense reasoning • real-world tasks • benchmarking

URL

Proceedings fulltext

https://aclanthology.org/2023.emnlp-main.0.pdf
Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
NLP  
Event name

The 2023 Conference on Empirical Methods in Natural Language Processing

Event acronym

EMNLP 2023

Event place

Singapore

Event date

2023-12-06 - 2023-12-10

Funder

Swiss National Science Foundation

Innosuisse – Swiss Innovation Agency

EPFL Science Seed Fund

Available on Infoscience
April 1, 2025
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/248410