Abstract

We propose a test-time adaptation method for 6D object pose tracking that adapts a pre-trained model to track the 6D pose of novel objects. We formulate 6D object pose tracking as a 3D keypoint detection and matching task and present a model that extracts 3D keypoints. Given an RGB-D image and the mask of a target object for each frame, the proposed model uses self- and cross-attention modules to produce features that aggregate information within and across frames, respectively. Using the keypoints detected from these features in each frame, we estimate the pose change between two frames, which enables 6D pose tracking when the 6D pose of the target object in the initial frame is given. Our model is first trained in a source domain, a category-level tracking dataset where the ground-truth 6D pose of the object is available. To deploy this pre-trained model on novel objects, we present a test-time adaptation strategy that adapts the model to the target novel object through self-supervised learning. Given an RGB-D video sequence of the novel object, the proposed self-supervised losses encourage the model to estimate 6D pose changes that preserve the photometric and geometric consistency of the object. We validate our method on the NOCS-REAL275 dataset and our collected dataset, and the results show the advantages of our method for tracking novel objects. The collected dataset and visualisations of the tracking results are available at https://qm-ipalab.github.io/TA-6DT/
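
To illustrate how matched 3D keypoints can yield a pose change between two frames, the following is a minimal sketch rather than the model described above. It assumes the keypoints are already matched one-to-one and expressed in the camera frame, and it uses the standard Kabsch (SVD-based least-squares) solution, which the abstract does not name. The function names `estimate_pose_change`, `to_matrix`, and `track` are hypothetical and chosen only for this example.

```python
import numpy as np


def estimate_pose_change(kp_prev, kp_curr):
    """Least-squares rigid transform (R, t) such that kp_curr ~ R @ kp_prev + t.

    kp_prev, kp_curr: (N, 3) arrays of matched 3D keypoints (assumed input).
    This is the standard Kabsch solution, used here as a stand-in.
    """
    c_prev = kp_prev.mean(axis=0)
    c_curr = kp_curr.mean(axis=0)
    # 3x3 cross-covariance of the centred point sets.
    H = (kp_prev - c_prev).T @ (kp_curr - c_curr)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c_curr - R @ c_prev
    return R, t


def to_matrix(R, t):
    """Pack a rotation and a translation into a 4x4 homogeneous transform."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T


def track(initial_pose, keypoint_pairs):
    """Chain per-frame pose changes onto a given initial 4x4 object pose."""
    pose = initial_pose.copy()
    poses = [pose]
    for kp_prev, kp_curr in keypoint_pairs:
        R, t = estimate_pose_change(kp_prev, kp_curr)
        # Keypoints live in the camera frame, so the change is left-multiplied.
        pose = to_matrix(R, t) @ pose
        poses.append(pose)
    return poses
```

Because each frame's pose is obtained by composing the estimated change with the previous pose, errors accumulate over a sequence, which is one reason the quality of the detected keypoints on a novel object matters.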

Details