conference paper
A Handwritten French Dataset for Word Spotting - CFRAMUZ
2017
HIP2017: Proceedings of the 4th International Workshop on Historical Document Imaging and Processing
We present a new and freely available dataset, CFRAMUZ, for segmentation-free word spotting research. The dataset consists of seven novels with a total number of 64 pages and 18000 words written in french by the Swiss writer C.F. Ramuz. The novels cover the writer’s whole period of life, therefore they show changes in the handwriting style. Together with the complete ground-truth of the dataset we provide an annotation tool. We provide evaluations of state-of-the-art word spotting approaches on this dataset. For completeness we also compare all the approaches on other commonly used datasets to demonstrate the new difficulties and challenges our new dataset introduces.