MonoLoco: Monocular 3D Pedestrian Localization and Uncertainty Estimation

We tackle the fundamentally ill-posed problem of 3D human localization from monocular RGB images. Driven by the limitation of neural networks outputting point estimates, we address the ambiguity in the task with a new neural network predicting confidence intervals through a loss function based on the Laplace distribution. Our architecture is a light-weight feed-forward neural network which predicts the 3D coordinates given 2D human pose. The design is particularly well suited for small training data and cross-dataset generalization. Our experiments show that (i) we outperform state-of-the art results on KITTI and nuScenes datasets, (ii) even outperform stereo for far-away pedestrians, and (iii) estimate meaningful confidence intervals. We further share insights on our model of uncertainty in case of limited observation and out-of-distribution samples.


Presented at:
ArXiv
Year:
Jun 07 2019
Keywords:
Laboratories:




 Record created 2019-06-07, last modified 2019-06-15

Fulltext:
Download fulltext
PDF

Rate this document:

Rate this document:
1
2
3
 
(Not yet reviewed)