Extended Overview of CLEF HIPE 2020: Named Entity Processing on Historical Newspapers

Ehrmann, Maud; Romanello, Matteo; Flückiger, Alex; Clematide, Simon

doi:10.5281/zenodo.4117566

conference paper

October 21, 2020

CLEF 2020 Working Notes. Conference and Labs of the Evaluation Forum

11th Conference and Labs of the Evaluation Forum (CLEF 2020)

This paper presents an extended overview of the first edition of HIPE (Identifying Historical People, Places and other Entities), a pioneering shared task dedicated to the evaluation of named entity processing on historical newspapers in French, German and English. Since its introduction some twenty years ago, named entity (NE) processing has become an essential component of virtually any text mining application and has undergone major changes. Two main trends have recently characterised its development: the adoption of deep learning architectures and the consideration of textual material originating from historical and cultural heritage collections. While the former opens up new opportunities, the latter introduces new challenges with heterogeneous, historical and noisy inputs. In this context, the objective of HIPE, run as part of the CLEF 2020 conference, is threefold: strengthening the robustness of existing approaches on non-standard inputs, enabling performance comparison of NE processing on historical texts, and, in the long run, fostering efficient semantic indexing of historical documents. Tasks, corpora, and results of 13 participating teams are presented. Compared to the condensed overview [31], this paper includes further details about data generation and statistics, additional information on participating systems, and the presentation of complementary results.

Name: paper_255-2.pdf

Type: Publisher's Version

Version: Published version

Access type: openaccess

License Condition: CC BY

Size: 2.32 MB

Format: Adobe PDF

Checksum (MD5): 799962dc556194aaa9f2644cdab8bb0b