Repository logo

Infoscience

  • English
  • French
Log In
Logo EPFL, École polytechnique fédérale de Lausanne

Infoscience

  • English
  • French
Log In
  1. Home
  2. Academic and Research Output
  3. Journal articles
  4. Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers
 
research article

Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers

Barman, Raphaël  
•
Ehrmann, Maud  
•
Clematide, Simon
Show more
2021
Journal of Data Mining & Digital Humanities

The massive amounts of digitized historical documents acquired over the last decades naturally lend themselves to automatic processing and exploration. Research work seeking to automatically process facsimiles and extract information thereby are multiplying with, as a first essential step, document layout analysis. Although the identification and categorization of segments of interest in document images have seen significant progress over the last years thanks to deep learning techniques, many challenges remain with, among others, the use of more fine-grained segmentation typologies and the consideration of complex, heterogeneous documents such as historical newspapers. Besides, most approaches consider visual features only, ignoring textual signal. We introduce a multimodal neural model for the semantic segmentation of historical newspapers that directly combines visual features at pixel level with text embedding maps derived from, potentially noisy, OCR output. Based on a series of experiments on diachronic Swiss and Luxembourgish newspapers, we investigate the predictive power of visual and textual features and their capacity to generalize across time and sources. Results show consistent improvement of multimodal models in comparison to a strong visual baseline, as well as better robustness to the wide variety of our material.

  • Files
  • Details
  • Metrics
Loading...
Thumbnail Image
Name

Barman et al. - 2021 - Combining Visual and Textual Features for Semantic.pdf

Type

Publisher's Version

Version

http://purl.org/coar/version/c_970fb48d4fbd8a85

Access type

openaccess

License Condition

CC BY

Size

5.31 MB

Format

Adobe PDF

Checksum (MD5)

c728b1c40933ee8e9d3dd718243dec14

Logo EPFL, École polytechnique fédérale de Lausanne
  • Contact
  • infoscience@epfl.ch

  • Follow us on Facebook
  • Follow us on Instagram
  • Follow us on LinkedIn
  • Follow us on X
  • Follow us on Youtube
AccessibilityLegal noticePrivacy policyCookie settingsEnd User AgreementGet helpFeedback

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, tous droits réservés