We present an approach to learn dense, continuous 2D-3D correspondence distributions over the surface of objects from data, with no prior knowledge of visual ambiguities such as symmetry. We also present a new method for 6D pose estimation of rigid objects that uses the learnt distributions to sample, score and refine pose hypotheses. The correspondence distributions are learnt with a contrastive loss and are represented in object-specific latent spaces by an encoder-decoder query model and a small fully connected key model. Our method is unsupervised with respect to visual ambiguities, yet we show that the query and key models learn to represent accurate multi-modal surface distributions. Our pose estimation method, trained purely on synthetic data, significantly improves the state of the art on the comprehensive BOP Challenge, even compared with methods trained on real data. The project site is at surfemb.github.io.
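To make the query/key formulation above more concrete, the following is a minimal NumPy sketch of how a contrastive loss over query and key embeddings could look, and how a correspondence distribution over surface points could be read off from it. It is not the authors' implementation: the InfoNCE-style form of the loss, the dot-product similarity, and the function names `contrastive_loss` and `correspondence_distribution` are assumptions made for illustration, since the abstract only states that a contrastive loss is used.

```python
import numpy as np


def log_softmax(logits, axis=-1):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def contrastive_loss(queries, pos_keys, neg_keys):
    """InfoNCE-style loss (an assumed form, not taken from the abstract).

    queries:  (N, D) embeddings of image pixels (query model output).
    pos_keys: (N, D) embeddings of the true surface point for each pixel.
    neg_keys: (M, D) embeddings of other surface points used as negatives.
    """
    pos_logits = np.sum(queries * pos_keys, axis=1, keepdims=True)  # (N, 1)
    neg_logits = queries @ neg_keys.T                               # (N, M)
    logits = np.concatenate([pos_logits, neg_logits], axis=1)       # (N, 1 + M)
    # The positive key sits in column 0; minimise its negative log-probability.
    return -log_softmax(logits, axis=1)[:, 0].mean()


def correspondence_distribution(query, surface_keys):
    """Softmax over surface keys for one pixel query.

    query:        (D,) embedding of a single pixel.
    surface_keys: (K, D) embeddings of K sampled surface points.
    Returns a (K,) probability vector over the surface points.
    """
    return np.exp(log_softmax(surface_keys @ query))


# Toy usage: two surface points with identical keys (as a symmetry could
# produce) receive equal probability, giving a multi-modal distribution.
query = np.array([1.0, 0.0])
surface_keys = np.array([[2.0, 0.0], [2.0, 0.0], [-2.0, 0.0]])
probs = correspondence_distribution(query, surface_keys)
```

In this sketch nothing tells the model which surface points are visually ambiguous. Points that cannot be told apart from the image simply end up sharing probability mass, which is one way to read the abstract's claim of being unsupervised with respect to visual ambiguities.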
|Title of host publication||2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)|
|Number of pages||10|
|Publication status||Published - 2022|
|Event||2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) - New Orleans, United States|
Duration: 18. Jun 2022 → 24. Jun 2022
|Series||IEEE Conference on Computer Vision and Pattern Recognition. Proceedings|
Bibliographical note
Publisher Copyright: © 2022 IEEE.
- Machine learning
- Pose estimation and tracking
- Representation learning