Stereo Hand-Object Reconstruction for Human-to-Robot Handover
Jointly estimating hand and object shape facilitates the grasping task in human-to-robot handovers. However, relying on handcrafted prior knowledge about the geometric structure of the object fails to generalise to unseen objects, and depth sensors fail to detect transparent objects such as drinking glasses. In this work, we propose a method for hand-object reconstruction that combines single-view reconstructions probabilistically to form a coherent stereo reconstruction. We learn 3D shape priors from a large synthetic hand-object dataset and use RGB inputs to better capture transparent objects. We show that our method reduces the object Chamfer distance compared to existing RGB-based hand-object reconstruction methods in both single-view and stereo settings. We process the reconstructed hand-object shape with a projection-based outlier removal step and use the output to guide a human-to-robot handover pipeline with wide-baseline stereo RGB cameras. Our hand-object reconstruction enables a robot to successfully receive a diverse range of household objects from a human.
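The abstract mentions a projection-based outlier removal step applied to the reconstructed hand-object shape before it guides the handover pipeline. The record does not detail this step, so the following is only a minimal sketch of one plausible formulation, assuming the reconstruction is a 3D point set and that calibrated camera parameters and per-view hand-object segmentation masks are available; all function and parameter names here are hypothetical, not the paper's API.

```python
import numpy as np

def project_points(points_3d, K, R, t):
    """Project 3D points (N, 3) into pixel coordinates using intrinsics K and pose (R, t)."""
    cam = points_3d @ R.T + t           # world frame -> camera frame
    uv = cam @ K.T                      # apply camera intrinsics
    return uv[:, :2] / uv[:, 2:3]       # perspective divide -> (N, 2) pixels

def mask_consistent(points_3d, K, R, t, mask):
    """Flag points whose projection falls inside the 2D hand-object mask of one view."""
    uv = np.round(project_points(points_3d, K, R, t)).astype(int)
    h, w = mask.shape
    inside = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    keep = np.zeros(len(points_3d), dtype=bool)
    keep[inside] = mask[uv[inside, 1], uv[inside, 0]] > 0
    return keep

def remove_outliers(points_3d, views):
    """Keep only points consistent with the masks of all views (e.g. left and right cameras)."""
    keep = np.ones(len(points_3d), dtype=bool)
    for K, R, t, mask in views:          # views: list of (K, R, t, mask) per camera
        keep &= mask_consistent(points_3d, K, R, t, mask)
    return points_3d[keep]
```

Under this assumed formulation, a reconstructed point is discarded if it reprojects outside the segmented hand-object region in either camera of the wide-baseline stereo pair, which removes spurious geometry before the grasp is planned.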
Scopus ID: 2-s2.0-105003372859
Queen Mary University of London
École Polytechnique Fédérale de Lausanne
2025
REVIEWED
EPFL