Abstract

The sliding window approach is the most widely used technique for detecting objects in an image. In the past few years, classifiers have been improved in many ways to increase the scanning speed. Apart from the classifier design (such as the cascade), the scanning speed also depends on a number of other factors, such as the grid spacing and the scales at which the image is searched. The scanning grid spacing controls the number of subwindows being processed, and thus the speed of detection. When the grid spacing is larger than the tolerance of the trained classifier, the detection rate drops. In this thesis, we propose an alternative search technique that improves the detection rate when fewer subwindows are processed. First, we present a technique to reduce the number of missed detections as the grid spacing is increased in the sliding window approach to object detection. This is achieved by using a small patch to predict the location of an object within a local search area. To achieve a speed gain, the time taken for location prediction must be comparable to or less than the average time the object classifier takes to reject a subwindow. We use binary features and a decision tree, as they proved to be efficient for our application. In the process, we also propose a variation of an existing binary feature (Ferns) that offers similar performance while requiring only half as many pixel accesses as the Fern feature. We analyze the effect of patch size on location estimation and evaluate our approach on several face databases. Experimental evaluation shows that, for larger grid spacings (fewer subwindows), our proposed approach achieves a better detection rate and speed than the standard scanning technique.
We also show that by using a simple interest point detector based on quantized gradient orientation as the front end to the proposed location estimation technique, we can achieve better performance even when fewer subwindows are processed. The detected interest points can be viewed as a non-regular grid, in contrast to the regular grid of the sliding window framework. A few image patches are sampled around each interest point to estimate the probable face location, and each estimate is then verified using a strong face classifier. Experimental results show that using an interest point detector can reduce the number of subwindows processed while maintaining a good detection rate.
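
To make the link between grid spacing and the number of subwindows concrete, the following is a minimal sketch that counts the subwindows visited by a regular scanning grid at a single scale. The function name, the image size, and the window size are illustrative assumptions and are not taken from the thesis.

```python
def count_subwindows(image_w, image_h, window, spacing):
    """Count square subwindows of side `window` placed on a regular grid.

    `spacing` is the grid step in pixels, applied in both x and y.
    All values here are hypothetical; the thesis does not specify them.
    """
    if spacing < 1:
        raise ValueError("spacing must be at least 1 pixel")
    if window > image_w or window > image_h:
        return 0  # the window does not fit in the image
    positions_x = (image_w - window) // spacing + 1
    positions_y = (image_h - window) // spacing + 1
    return positions_x * positions_y


if __name__ == "__main__":
    # Assumed example: a 320x240 image scanned with a 24x24 window.
    for spacing in (1, 2, 4, 8):
        n = count_subwindows(320, 240, 24, spacing)
        print(f"spacing={spacing}: {n} subwindows")
```

Doubling the spacing cuts the count to roughly a quarter, which is why a larger spacing speeds up scanning, and why a classifier that cannot tolerate the resulting offset starts to miss objects.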