Real-Time Seamless Single Shot 6D Object Pose Prediction

We propose a single-shot approach for simultaneously detecting an object in an RGB image and predicting its 6D pose without requiring multiple stages or having to examine multiple hypotheses. Unlike a recently proposed single-shot technique for this task [10] that only predicts an approximate 6D pose that must then be refined, ours is accurate enough not to require additional post-processing. As a result, it is much faster, at 50 fps on a Titan X (Pascal) GPU, and more suitable for real-time processing. The key component of our method is a new CNN architecture inspired by [27, 28] that directly predicts the 2D image locations of the projected vertices of the object's 3D bounding box. The object's 6D pose is then estimated using a PnP algorithm.
For single-object and multiple-object pose estimation on the LINEMOD and OCCLUSION datasets, our approach substantially outperforms other recent CNN-based approaches [10, 25] when they are all used without post-processing. A pose refinement step can be applied as post-processing to boost the accuracy of these two methods, but at 10 fps or less, they are much slower than our method.
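The quantities the abstract refers to, namely the 2D image locations of the projected vertices of an object's 3D bounding box, can be illustrated with a minimal sketch of the forward projection. This is not the authors' implementation: the pinhole camera model, the intrinsics (`fx`, `fy`, `cx`, `cy`), the box dimensions, and all function names are assumptions chosen for illustration. A PnP solver (for example OpenCV's `cv2.solvePnP`) would go the other way, recovering the rotation and translation from such 2D-3D correspondences.

```python
import math


def bbox_corners(min_xyz, max_xyz):
    """Return the 8 corners of an axis-aligned 3D box in object coordinates."""
    (x0, y0, z0), (x1, y1, z1) = min_xyz, max_xyz
    return [(x, y, z) for x in (x0, x1) for y in (y0, y1) for z in (z0, z1)]


def rotation_y(angle_rad):
    """3x3 rotation matrix about the camera's y axis (hypothetical helper)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def project(points, R, t, fx, fy, cx, cy):
    """Project 3D object points to pixel coordinates with a pinhole camera.

    Each point X is mapped to camera coordinates Xc = R X + t and then to
    pixels (fx * xc / zc + cx, fy * yc / zc + cy). Lens distortion is ignored.
    """
    pixels = []
    for X in points:
        xc, yc, zc = (
            sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)
        )
        pixels.append((fx * xc / zc + cx, fy * yc / zc + cy))
    return pixels


# Assumed example: a 20 cm cube, rotated 30 degrees, 1 m in front of the camera.
corners_3d = bbox_corners((-0.1, -0.1, -0.1), (0.1, 0.1, 0.1))
corners_2d = project(
    corners_3d,
    rotation_y(math.radians(30.0)),
    (0.0, 0.0, 1.0),
    fx=500.0, fy=500.0, cx=320.0, cy=240.0,
)
```

In this sketch, `corners_2d` plays the role of the network's output, and `corners_3d` is the known object geometry; the pair forms the 2D-3D correspondences that a PnP algorithm consumes.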

Published in:
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 292-301
Presented at:
31st IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, Jun 18-23, 2018
Jan 01 2018
New York, IEEE

 Record created 2019-06-18, last modified 2019-12-05
