Text Recognition in Natural Images Using Multiclass Hough Forests

Yildirim, Gökhan; Achanta, Radhakrishna; Süsstrunk, Sabine

2013

Abstract

Text detection and recognition in natural images are popular yet unsolved problems in computer vision. In this paper, we propose a technique that attempts to detect and recognize text in a unified manner by searching for words directly, without first reducing the image to text regions or individual characters. We present three contributions. First, we modify an object detection framework called Hough Forests (Gall et al., 2011) by introducing "Cross-Scale Binary Features", which compare information from the same image patch at different scales. We use this modified technique to produce likelihood maps for every text character. Second, we use our word-formation cost function and the computed likelihood maps to detect and recognize text in natural images. We test our technique on the Street View House Numbers (SVHN) (Netzer et al., 2011) and the ICDAR 2003 (Lucas et al., 2003) datasets. For the SVHN dataset, our algorithm outperforms recent methods and has comparable performance using fewer training samples. We also exceed the state-of-the-art word recognition performance on the ICDAR 2003 dataset by 4%. Our final contribution is code for generating a realistic dataset of text characters.
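
The abstract only says that the cross-scale features compare the same image patch at different scales; the exact definition is in the paper. As a rough illustration of that idea, and not the authors' implementation, the Python sketch below evaluates one binary test between two copies of a patch at different scales. The function names (`coarsen`, `cross_scale_binary_test`), the use of average pooling to build the coarser scale, and the `threshold` parameter are all assumptions made for this example.

```python
import numpy as np


def coarsen(patch: np.ndarray, factor: int) -> np.ndarray:
    """Average-pool a 2-D patch by `factor`, then repeat pixels so the
    output has the same shape as the input (assumed way to build a scale)."""
    patch = patch.astype(float)
    if factor == 1:
        return patch
    h, w = patch.shape
    if h % factor or w % factor:
        raise ValueError("patch sides must be divisible by factor")
    pooled = patch.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))
    return np.repeat(np.repeat(pooled, factor, axis=0), factor, axis=1)


def cross_scale_binary_test(patch, p, q, scale_p, scale_q, threshold=0.0):
    """Hypothetical binary test: 1 if the value at pixel `p` in the patch
    at scale `scale_p` is below the value at pixel `q` at scale `scale_q`
    plus `threshold`, else 0."""
    a = coarsen(patch, scale_p)[p]
    b = coarsen(patch, scale_q)[q]
    return int(a < b + threshold)


# Toy 4x4 patch with values 0..15 in row-major order.
patch = np.arange(16, dtype=float).reshape(4, 4)

# Compare the top-left pixel at full resolution with the same location
# after 2x2 average pooling.
bit = cross_scale_binary_test(patch, p=(0, 0), q=(0, 0), scale_p=1, scale_q=2)
```

In a decision forest, many such tests with randomly drawn pixel locations, scales, and thresholds would typically be candidates for the split function at each tree node; how the paper selects them is not stated in this record.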

Details

Title Text Recognition in Natural Images Using Multiclass Hough Forests

Author(s) Yildirim, Gökhan ; Achanta, Radhakrishna ; Süsstrunk, Sabine

Published in Proceedings of the 8th International Conference on Computer Vision Theory and Applications

Volume 1

Pages 737-741

Conference 8th International Conference on Computer Vision Theory and Applications (VISAPP), Barcelona, Spain, February 21-24, 2013

Date 2013

ISBN 978-989-8565-47-1

Keywords

Text detection and recognition; Hough forests; Feature selection; Natural images

Laboratories IVRL

Record Appears in Scientific production and competences > I&C - School of Computer and Communication Sciences > IINFCOM > IVRL - Image and Visual Representation Laboratory
Peer-reviewed publications
Conference Papers
Work produced at EPFL
Published

Record creation date 2013-03-05