conference paper

Text Recognition in Natural Images Using Multiclass Hough Forests

Yildirim, Gökhan • Achanta, Radhakrishna • Süsstrunk, Sabine
2013
Proceedings of the 8th International Conference on Computer Vision Theory and Applications
8th International Conference on Computer Vision Theory and Applications (VISAPP)

Text detection and recognition in natural images are popular yet unsolved problems in computer vision. In this paper, we propose a technique that attempts to detect and recognize text in a unified manner by searching for words directly, without first reducing the image to text regions or individual characters. We present three contributions. First, we modify an object detection framework called Hough Forests (Gall et al., 2011) by introducing "Cross-Scale Binary Features" that compare the information in the same image patch at different scales. We use this modified technique to produce likelihood maps for every text character. Second, we use our word-formation cost function together with the computed likelihood maps to detect and recognize text in natural images. We test our technique on the Street View House Numbers (SVHN) (Netzer et al., 2011) and ICDAR 2003 (Lucas et al., 2003) datasets. On the SVHN dataset, our algorithm outperforms recent methods and achieves comparable performance when using fewer training samples. We also exceed the state-of-the-art word recognition performance on the ICDAR 2003 dataset by 4%. Our final contribution is code that generates realistic datasets of text characters.
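To make the idea of a cross-scale comparison concrete, the following is a minimal sketch and not the authors' implementation. The abstract says only that the features compare the same image patch at different scales; everything else here is an assumption for illustration: the function names `downsample` and `cross_scale_binary_feature`, the use of block averaging as the rescaling step, the scale factor of 2, and the choice of a single pixel-versus-pixel intensity test.

```python
from typing import List, Tuple

Patch = List[List[float]]  # grayscale patch as rows of intensities


def downsample(patch: Patch, factor: int) -> Patch:
    """Average non-overlapping factor x factor blocks.

    Assumption: block averaging stands in for whatever rescaling the paper uses.
    """
    rows = len(patch) // factor
    cols = len(patch[0]) // factor
    out: Patch = []
    for r in range(rows):
        row = []
        for c in range(cols):
            block = [
                patch[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def cross_scale_binary_feature(
    patch: Patch, p: Tuple[int, int], q: Tuple[int, int], factor: int = 2
) -> int:
    """Hypothetical binary test across two scales of the same patch.

    p indexes the original (fine) patch, q indexes the downsampled (coarse)
    patch. Returns 1 if the fine value at p is smaller than the coarse value
    at q, otherwise 0.
    """
    coarse = downsample(patch, factor)
    return int(patch[p[0]][p[1]] < coarse[q[0]][q[1]])
```

In a decision forest, many such tests with randomly chosen `p` and `q` would typically serve as candidate split functions at tree nodes; how the paper selects and combines them is not stated in the abstract.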

Name: TEXT RECOGNITION IN NATURAL IMAGES USING MULTICLASS HOUGH FORESTS.pdf
Type: Publisher's Version
Version: Published version
Access type: openaccess
Size: 4.02 MB
Format: Adobe PDF
Checksum (MD5): 2b8ae23421d2a16c018e01706e8d4096
