Title
Towards Two-view 6D Object Pose Estimation: A Comparative Study on Fusion Strategy
Authors
Abstract
Current RGB-based 6D object pose estimation methods have achieved noticeable performance on benchmark datasets and in real-world applications. However, predicting the 6D pose from single-image 2D features is susceptible to disturbances from environmental changes and from textureless or similar-looking object surfaces. Hence, RGB-based methods generally achieve less competitive results than RGBD-based methods, which exploit both image features and 3D structure features. To narrow this performance gap, this paper proposes a framework for 6D object pose estimation that learns implicit 3D information from two RGB images. Combining the learned 3D information with 2D image features, we establish more stable correspondences between the scene and the object models. To find the best way of utilizing 3D information from RGB inputs, we investigate three different approaches: Early-Fusion, Mid-Fusion, and Late-Fusion. We ascertain that Mid-Fusion is the best approach for recovering the most precise 3D keypoints for object pose estimation. Experiments show that our method outperforms state-of-the-art RGB-based methods and achieves results comparable to RGBD-based methods.
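The distinction between the three fusion strategies compared in the abstract is where information from the two views is merged in the pipeline. A minimal sketch of the idea, using NumPy stand-ins: the tiny `encoder`/`head` functions, all dimensions, and the averaging merge are illustrative assumptions, not the authors' actual network architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

D_in, D_feat, D_out = 8, 16, 3
W_enc = rng.standard_normal((D_in, D_feat)) * 0.1   # stand-in backbone weights
W_head = rng.standard_normal((D_feat, D_out)) * 0.1  # stand-in regression head

def encoder(x):
    # Stand-in for a per-view feature extractor (e.g., a 2D CNN backbone).
    return np.tanh(x @ W_enc)

def head(feat):
    # Stand-in for the 3D-keypoint regression head.
    return feat @ W_head

view_a = rng.standard_normal(D_in)  # features from RGB view 1
view_b = rng.standard_normal(D_in)  # features from RGB view 2

# Early-Fusion: merge the raw inputs, then run a single shared pipeline.
early = head(encoder((view_a + view_b) / 2))

# Mid-Fusion: encode each view separately, merge intermediate features,
# then predict from the fused representation.
mid = head((encoder(view_a) + encoder(view_b)) / 2)

# Late-Fusion: run the full pipeline per view, merge the final predictions.
late = (head(encoder(view_a)) + head(encoder(view_b))) / 2

print(early.shape, mid.shape, late.shape)  # each is a (3,) prediction
```

Because the encoder is nonlinear, the three strategies generally produce different predictions even with identical weights; the paper's finding is that the mid-level merge preserves the most useful cross-view 3D cues.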