Infoscience

report

Idiap Abstract Text Summarization System for German Text Summarization Task

Parida, Shantipriya; Motlicek, Petr
2020

Text summarization is considered a challenging task in the NLP community. Datasets for multilingual text summarization are rare and difficult to construct. In this work, we build an abstractive text summarizer for German text using the state-of-the-art “Transformer” model. We propose an iterative data augmentation approach that uses synthetic data alongside real summarization data for German. To generate the synthetic data, we exploit the Common Crawl (German) dataset, which covers different domains. The synthetic data is effective in low-resource conditions and is particularly helpful in multilingual scenarios, where the availability of summarization data remains a challenge.

Type: report
Author(s): Parida, Shantipriya; Motlicek, Petr
Date Issued: 2020
Publisher: Idiap
URL: http://publications.idiap.ch/downloads/reports/2019/Parida_Idiap-RR-03-2020.pdf
Written at: EPFL
EPFL units: LIDIAP
Available on Infoscience: February 18, 2020
Use this identifier to reference this record: https://infoscience.epfl.ch/handle/20.500.14299/166336

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved