research article

Named Entity Recognition and Classification in Historical Documents: A Survey

Ehrmann, Maud • Hamdi, Ahmed • Linhares Pontes, Elvys • Romanello, Matteo • Doucet, Antoine
September 21, 2021
ACM Computing Surveys

After decades of massive digitisation, an unprecedented amount of historical documents is available in digital format, along with their machine-readable texts. While this represents a major step forward with respect to preservation and accessibility, it also opens up new opportunities in terms of content mining and the next fundamental challenge is to develop appropriate technologies to efficiently search, retrieve and explore information from this 'big data of the past'. Among semantic indexing opportunities, the recognition and classification of named entities are in great demand among humanities scholars. Yet, named entity recognition (NER) systems are heavily challenged with diverse, historical and noisy inputs. In this survey, we present the array of challenges posed by historical documents to NER, inventory existing resources, describe the main approaches deployed so far, and identify key priorities for future developments.

Type: research article
Web of Science ID: WOS:001085637600002
ArXiv ID: 2109.11406
Author(s): Ehrmann, Maud; Hamdi, Ahmed; Linhares Pontes, Elvys; Romanello, Matteo; Doucet, Antoine
Date Issued: 2021-09-21
Published in: ACM Computing Surveys
Volume: 56
Issue: 2
Start page: 27
Subjects: named entity recognition and classification • historical documents • natural language processing • digital humanities
Editorial or Peer reviewed: REVIEWED
Written at: EPFL
EPFL units: DHLAB
Available on Infoscience: October 23, 2022
Use this identifier to reference this record: https://infoscience.epfl.ch/handle/20.500.14299/191499