Repository logo

Infoscience

  • English
  • French
Log In
Logo EPFL, École polytechnique fédérale de Lausanne

Infoscience

  • English
  • French
Log In
  1. Home
  2. Academic and Research Output
  3. Conferences, Workshops, Symposiums, and Seminars
  4. Comparing human and machine performances in transcribing 18th century handwritten Venetian script
 
conference paper not in proceedings

Comparing human and machine performances in transcribing 18th century handwritten Venetian script

Ares Oliveira, Sofia  
•
Kaplan, Frédéric  
July 26, 2018
Digital Humanities Conference

Automatic transcription of handwritten texts has made important progress in the recent years. This increase in performance, essentially due to new architectures combining convolutional neural networks with recurrent neutral networks, opens new avenues for searching in large databases of archival and library records. This paper reports on our recent progress in making million digitized Venetian documents searchable, focusing on a first subset of 18th century fiscal documents from the Venetian State Archives. For this study, about 23’000 image segments containing 55’000 Venetian names of persons and places were manually transcribed by archivists, trained to read such kind of handwritten script. This annotated dataset was used to train and test a deep learning architecture with a performance level (about 10% character error rate) that is satisfactory for search use cases. This paper compares this level of reading performance with the reading capabilities of Italian-speaking transcribers. More than 8500 new human transcriptions were produced, confirming that the amateur transcribers were not as good as the expert. However, on average, the machine outperforms the amateur transcribers in this transcription tasks.

  • Files
  • Details
  • Metrics
Loading...
Thumbnail Image
Name

final_abstract_dh18.pdf

Type

Publisher's Version

Version

http://purl.org/coar/version/c_970fb48d4fbd8a85

Access type

openaccess

License Condition

CC BY

Size

4.19 MB

Format

Adobe PDF

Checksum (MD5)

af3bedb917053dbfe3cbb0bb9e5b222f

Logo EPFL, École polytechnique fédérale de Lausanne
  • Contact
  • infoscience@epfl.ch

  • Follow us on Facebook
  • Follow us on Instagram
  • Follow us on LinkedIn
  • Follow us on X
  • Follow us on Youtube
AccessibilityLegal noticePrivacy policyCookie settingsEnd User AgreementGet helpFeedback

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, tous droits réservés