Infoscience

report

Boosting named entity recognition in domain-specific and low-resource settings

Najem-Meyer, Sven  
January 13, 2022

Recent research in natural language processing has leveraged attention-based models to produce state-of-the-art results on a wide variety of tasks. Through transfer learning, generic models such as BERT can be fine-tuned for domain-specific tasks with little annotated data. In digital humanities and classics, bibliographical reference extraction is one of the domain-specific tasks for which few annotated datasets are available. It therefore remains a highly challenging Named Entity Recognition (NER) problem that these approaches have not yet addressed. In this study, we try to boost bibliographical reference extraction with various transfer learning strategies. We compare three transformer models against a Conditional Random Fields (CRF) model developed by Romanello, using both generic and domain-specific pre-training. Experiments show that the transformers consistently improve on the CRF baselines. However, domain-specific pre-training yields no significant benefit. We discuss and compare these results in light of comparable research in domain-specific NER.

Type
report
Author(s)
Najem-Meyer, Sven  
Date Issued

2022-01-13

Number of pages

21

Subjects

nlp • citation mining • ner

Note

This is the final version of the report for a doctoral semester project conducted between February and September 2021.

Editorial or Peer reviewed

NON-REVIEWED

Written at

EPFL

EPFL units
DHLAB  
Available on Infoscience
January 13, 2022
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/184441