Motion models for robust 3D human body tracking

In this work, we propose new ways to learn pose and motion prior models and show that they can be used to improve the performance of 3D body tracking algorithms, yielding realistic motions under challenging conditions. We first explored an approach to 3D people tracking that combines learned motion models and deterministic optimization. The tracking problem is formulated as the minimization of a differentiable criterion whose differential structure is rich enough for optimization to be accomplished via hill-climbing. This avoids the computational expense of Monte Carlo methods, while yielding very good results under challenging conditions. To demonstrate the generality of the approach, we show that we can learn and track cyclic motions such as walking and running, as well as acyclic motions such as a golf swing. We also show results from both monocular and multi-camera tracking. Finally, we provide results with a motion model learned from multiple activities, and show how these models can be used for recognition and motion generation. The major limitation of these linear motion models is that they require many noiseless, segmented and time-warped examples to create a complete database with good generalization properties. We therefore investigated more complex non-linear statistical techniques. We advocate the use of Scaled Gaussian Process Latent Variable Models (SGPLVM) to learn prior models of 3D human pose. The SGPLVM simultaneously optimizes a low-dimensional embedding of the high-dimensional pose data and a density function that both gives higher probability to points close to the training data and provides a nonlinear probabilistic mapping from the low-dimensional latent space to the full-dimensional pose space. The SGPLVM is a natural choice when only small amounts of training data are available. We demonstrate our approach with two distinct motions, golfing and walking.
We show that the SGPLVM constrains the problem sufficiently for tracking to be accomplished with straightforward deterministic optimization. However, in the presence of very noisy or missing data, for example due to occlusions, the simplistic second-order Markov model we use is not realistic enough to sufficiently constrain the algorithm. Moreover, when learning models that contain stylistic diversity, from different people or from the same person performing an activity multiple times, the SGPLVM results in models whose latent trajectories are not smooth, and are therefore not suited for hill-climbing tracking. Finally, we present a more powerful approach based on Gaussian Process Dynamical Models (GPDMs) that combines the strengths of the two previous ones. We advocate the use of GPDMs for learning human pose and motion priors. A GPDM provides a low-dimensional embedding of human motion data, with a density function that gives higher probability to poses and motions close to the training data. With Bayesian model averaging, a GPDM can be learned from relatively small amounts of data, and it generalizes gracefully to motions outside the training set. Here we modify the GPDM to permit learning from motions with significant stylistic variation. The resulting priors are effective for tracking a range of human walking styles, despite weak and noisy image measurements and significant occlusions.
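
The idea shared by the SGPLVM and GPDM priors described above, a Gaussian process mapping from a low-dimensional latent space to pose space whose density favors poses near the training data, can be illustrated with a minimal NumPy sketch. This is not the implementation used in this work: the latent points are fixed by hand on a circle rather than optimized, the per-dimension scaling of the SGPLVM and the dynamics of the GPDM are omitted, and the toy 6-D "poses", the kernel hyperparameters, and all function names are assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, signal_var=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq = (np.sum(a**2, axis=1)[:, None]
          + np.sum(b**2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return signal_var * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def gp_predict(x_star, X, Y, noise_var=1e-3):
    """Predictive mean and variance of the pose at latent points x_star."""
    K = rbf_kernel(X, X) + noise_var * np.eye(len(X))
    k_star = rbf_kernel(x_star, X)
    mean = k_star @ np.linalg.solve(K, Y)
    explained = np.sum(k_star * np.linalg.solve(K, k_star.T).T, axis=1)
    # Adding noise_var gives the variance of an observed pose and keeps it positive.
    var = rbf_kernel(x_star, x_star).diagonal() - explained + noise_var
    return mean, var

def neg_log_prior(y, x, X, Y):
    """Negative log of the Gaussian predictive density of pose y at latent x,
    up to an additive constant (isotropic across pose dimensions)."""
    mean, var = gp_predict(x[None, :], X, Y)
    d = y.shape[0]
    return 0.5 * np.sum((y - mean[0])**2) / var[0] + 0.5 * d * np.log(var[0])

# Hypothetical training set: 20 latent points on a circle (a cyclic motion)
# and made-up 6-D poses that vary smoothly with the phase.
theta = np.linspace(0.0, 2.0 * np.pi, 20, endpoint=False)
X = np.stack([np.cos(theta), np.sin(theta)], axis=1)
Y = np.stack([np.sin(theta + 0.5 * k) for k in range(6)], axis=1)

near = X[0] + 0.01            # latent point next to a training point
far = np.array([3.0, 3.0])    # latent point far from all training data

_, var_near = gp_predict(near[None, :], X, Y)
_, var_far = gp_predict(far[None, :], X, Y)
energy_near = neg_log_prior(Y[0], near, X, Y)
energy_far = neg_log_prior(Y[0], far, X, Y)
print(var_near[0], var_far[0], energy_near, energy_far)
```

In this sketch the predictive variance is small near the training data and approaches the kernel's signal variance far from it, so a training pose receives a much lower energy at its own latent position than at a distant one. Because the energy is a smooth function of the latent coordinates, it is the kind of term a deterministic optimizer could minimize during tracking.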

