Abstract

We present a fast method for detecting humans in videos from stationary surveillance cameras. It is based on a cascade of LogitBoost classifiers that use covariance matrices as object descriptors. We make three contributions. First, our method learns the correlation between appearance and foreground features, and we show that the human shape information contained in foreground observations can dramatically improve performance when used jointly with appearance cues. This contrasts with traditional approaches that exploit background subtraction as an attentive filter, applying still-image detectors only to foreground regions. Second, we show that using the covariance matrices of feature subsets, rather than of the full feature set, in boosting provides similar or better performance while significantly reducing the computational load. Third, we propose a simple image rectification scheme that removes the slant of people in images from wide-angle cameras, allowing for the appropriate use of integral images. Extensive experiments on a large video set show that our approach performs much better than the attentive filter paradigm while processing 5-20 frames/s. The efficiency of the subset approach, with state-of-the-art results, is also demonstrated on the INRIA human (static image) database. (C) 2011 Elsevier Inc. All rights reserved.
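
To make the covariance descriptor and the feature-subset idea concrete, the following is a minimal sketch, not the authors' implementation. The per-pixel features (pixel coordinates, intensity, gradient magnitudes, and a foreground probability), the window position, and the helper name `region_covariance` are all assumptions chosen for illustration; the abstract does not specify them. The sketch computes the covariance directly with NumPy rather than with integral images.

```python
import numpy as np

def region_covariance(features, top, left, height, width, subset=None):
    """Covariance descriptor of a rectangular window (illustrative helper).

    features: array of shape (H, W, d), one d-dimensional feature vector per pixel.
    subset:   optional sequence of feature indices; None uses all d features.
    """
    window = features[top:top + height, left:left + width, :]
    if subset is not None:
        window = window[:, :, list(subset)]
    # Flatten the window to (n_pixels, k) and compute the k x k covariance.
    samples = window.reshape(-1, window.shape[-1])
    return np.cov(samples, rowvar=False)

# Hypothetical feature map on random data: x, y, intensity, |Ix|, |Iy|,
# and a foreground probability standing in for background subtraction output.
rng = np.random.default_rng(0)
H, W = 64, 32
ys, xs = np.mgrid[0:H, 0:W]
intensity = rng.random((H, W))
iy, ix = np.gradient(intensity)  # derivatives along rows (y) and columns (x)
foreground = rng.random((H, W))
features = np.stack(
    [xs, ys, intensity, np.abs(ix), np.abs(iy), foreground], axis=-1
)

# Full 6 x 6 descriptor versus a 3 x 3 descriptor on a feature subset.
full = region_covariance(features, 8, 4, 32, 16)
sub = region_covariance(features, 8, 4, 32, 16, subset=[2, 3, 5])
```

The subset descriptor is a principal submatrix of the full one, so a weak classifier built on k of d features handles a k x k matrix instead of a d x d one. Because the foreground feature sits in the same matrix as the appearance features, their cross-covariance terms are part of the descriptor.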
