conference paper

A Large-Scale Dataset for Empathetic Response Generation

Welivita, Anuradha • Xie, Yubo • Pu, Pearl
January 1, 2021
2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021)
Conference on Empirical Methods in Natural Language Processing (EMNLP)

Recent developments in NLP show a strong trend towards refining pre-trained models with domain-specific datasets. This is especially the case for response generation, where emotion plays an important role. However, existing empathetic datasets remain small, which slows research in this area, for example the development of emotion-aware chatbots. One main technical challenge has been the cost of manually annotating dialogues with the right emotion labels. In this paper, we describe a large-scale silver dataset consisting of 1M dialogues annotated with 32 fine-grained emotions, eight empathetic response intents, and the Neutral category. To achieve this goal, we have developed a novel data curation pipeline that starts with a small seed of manually annotated data and eventually scales it to a satisfactory size. We compare its quality against a state-of-the-art gold dataset using offline experiments and visual validation methods. The resulting procedure can be used to create similar datasets in the same domain as well as in other domains.

Type: conference paper
DOI: 10.18653/v1/2021.emnlp-main.96
Web of Science ID: WOS:000855966301029
Author(s): Welivita, Anuradha • Xie, Yubo • Pu, Pearl
Date Issued: 2021-01-01
Publisher: Association for Computational Linguistics (ACL)
Publisher place: Stroudsburg
Published in: 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021)
ISBN of the book: 978-1-955917-09-4
Start page: 1251
End page: 1264
Subjects: Computer Science, Artificial Intelligence • Computer Science, Interdisciplinary Applications • Linguistics • Computer Science
Editorial or Peer reviewed: Reviewed
Written at: EPFL
Event name: Conference on Empirical Methods in Natural Language Processing (EMNLP)
Event place: Punta Cana, Dominican Republic
Event date: November 7-11, 2021
Available on Infoscience: November 7, 2022
Use this identifier to reference this record: https://infoscience.epfl.ch/handle/20.500.14299/191867