CRAB: Assessing the Strength of Causal Relationships Between Real-World Events
Understanding narratives requires reasoning about the cause-and-effect relationships between events mentioned in the text. While existing foundation models yield impressive results on many NLP tasks requiring reasoning, it is unclear whether they understand the complexity of the underlying network of causal relationships between events in narratives. In this work, we present CRAB, a new Causal Reasoning Assessment Benchmark designed to evaluate causal understanding of events in real-world narratives. CRAB contains fine-grained, contextual causality annotations for ∼2.7K pairs of real-world events drawn from various newsworthy event timelines (e.g., the acquisition of Twitter by Elon Musk). Using CRAB, we measure the performance of several large language models and demonstrate that most systems achieve poor performance on the task. Motivated by classical causal principles, we also analyze the causal structures of groups of events in CRAB, and find that models perform worse on causal reasoning when events are derived from complex causal structures than from simple linear causal chains.
Files in this record:

File: CRAB__Assessing_the_Strength_of_Causal_Relationships_Between_Real_world_Events_camera.pdf
  Description: Video
  Version: http://purl.org/coar/version/c_970fb48d4fbd8a85
  Access: openaccess
  License: CC BY
  Size: 2.81 MB
  Format: Adobe PDF
  MD5: 9397e065e8202cfab313181dfb30b3f8

File: 2023.emnlp-main.940.mp4
  Description: Video
  Version: http://purl.org/coar/version/c_be7fb7dd8ff6fe43
  Access: openaccess
  License: N/A
  Size: 47.4 MB
  Format: MP4
  MD5: e1f2c9ee77d14501eaa5696a2889fcee