Exploiting Long-Term Connectivity and Visual Motion in CRF-based Multi-Person Tracking
We present a conditional random field (CRF) approach to tracking-by-detection in which we model pairwise factors linking pairs of detections and their hidden labels, as well as higher-order potentials defined in terms of label costs. In contrast to previous work, our method considers long-term connectivity between pairs of detections and models both similarities and dissimilarities between them, based on position, color, and, as a novelty, visual motion cues. We introduce a set of feature-specific confidence scores that weight each feature's contribution according to its reliability. Pairwise potential parameters are then learned in an unsupervised way from detections or from tracklets. Label costs are defined so as to penalize the complexity of the labeling, based on prior knowledge about the scene such as the location of entry/exit zones. Experiments on the PETS'09, TUD, CAVIAR, Parking Lot, and Town Center public data sets show the validity of our approach, with performance similar to or better than that of recent state-of-the-art algorithms.
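To make the structure of such a model concrete, the following is a minimal illustrative sketch of an energy that combines confidence-weighted pairwise terms with a label cost. It is not the formulation used in the paper: the function name `labeling_energy`, the signed per-feature scores, the confidence values, and the constant per-label cost are all assumptions chosen for illustration. In particular, the label cost here simply charges a fixed amount per distinct label, whereas the abstract describes costs informed by scene knowledge such as entry/exit zones.

```python
def labeling_energy(labels, pairwise_scores, confidences, label_cost=1.0):
    """Energy of a labeling of detections; lower is better (illustrative only).

    labels: list of track labels, one per detection.
    pairwise_scores: dict (i, j) -> {feature: score}; a positive score means
        the feature suggests detections i and j are the same person, a
        negative score that they are different people (assumed convention).
    confidences: dict (i, j) -> {feature: weight in [0, 1]} expressing how
        reliable each feature is for that pair (assumed convention).
    label_cost: hypothetical constant charged once per distinct label used.
    """
    energy = 0.0
    for (i, j), feats in pairwise_scores.items():
        # Confidence-weighted sum of the per-feature evidence for this pair.
        agreement = sum(confidences[(i, j)][f] * s for f, s in feats.items())
        # Giving both detections the same label is rewarded when the evidence
        # is positive; giving them different labels is rewarded when negative.
        sign = -1.0 if labels[i] == labels[j] else 1.0
        energy += sign * agreement
    # Penalize labeling complexity: one cost per distinct track label.
    energy += label_cost * len(set(labels))
    return energy


# Toy example with three detections. The pair (0, 2) stands in for a
# long-term link between detections that are not temporally adjacent.
scores = {
    (0, 1): {"position": 2.0, "color": 1.0, "motion": 0.5},
    (1, 2): {"position": -1.0, "color": -2.0, "motion": -1.0},
    (0, 2): {"position": -0.5},
}
conf = {
    (0, 1): {"position": 1.0, "color": 0.5, "motion": 1.0},
    (1, 2): {"position": 1.0, "color": 0.5, "motion": 0.0},  # motion unreliable
    (0, 2): {"position": 1.0},
}

candidates = [[0, 0, 1], [0, 0, 0], [0, 1, 2]]
best = min(candidates, key=lambda lab: labeling_energy(lab, scores, conf))
print(best)  # [0, 0, 1]
```

In this toy setting, merging detections 0 and 1 into one track while keeping detection 2 separate yields the lowest energy. The label cost discourages splitting everything into separate tracks, and the negative pairwise evidence discourages merging all three.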