Paper Title

Spatial Feature Mapping for 6DoF Object Pose Estimation

Paper Authors

Jianhan Mei, Xudong Jiang, Henghui Ding

Paper Abstract


This work aims to estimate the 6DoF (6D) object pose in background clutter. Considering the strong occlusion and background noise, we propose to utilize the spatial structure to better tackle this challenging task. Observing that a 3D mesh can be naturally abstracted as a graph, we build the graph using the 3D points as vertices and the mesh connections as edges. We construct the corresponding mapping from 2D image features to 3D points to fill the graph and fuse the 2D and 3D features. Afterward, a Graph Convolutional Network (GCN) is applied to facilitate feature exchange among the object's points in 3D space. To address the rotation symmetry ambiguity of objects, a spherical convolution is utilized, and the spherical features are combined with the convolutional features mapped to the graph. Predefined 3D keypoints are voted for, and the 6DoF pose is obtained via fitting optimization. Two inference scenarios, one with depth information and one without, are discussed. Tested on the YCB-Video and LINEMOD datasets, the experiments demonstrate the effectiveness of our proposed method.
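The core idea of treating the mesh as a graph and propagating per-point features with a GCN can be illustrated with a minimal sketch. This is not the authors' implementation; the function names, the toy tetrahedron mesh, and the random features standing in for the 2D-mapped image features are all illustrative assumptions. It shows only the generic graph construction (vertices = 3D points, edges = mesh connectivity) and one symmetrically normalized GCN propagation step:

```python
import numpy as np

def mesh_to_adjacency(num_points, faces):
    """Adjacency matrix from triangle faces (each face = 3 vertex indices)."""
    A = np.zeros((num_points, num_points))
    for i, j, k in faces:
        for a, b in ((i, j), (j, k), (k, i)):
            A[a, b] = A[b, a] = 1.0  # mesh connections become graph edges
    return A

def gcn_layer(A, X, W):
    """One GCN layer: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # symmetric normalization
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0.0)

# Toy example: a tetrahedron mesh (4 points, 4 triangular faces).
faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
A = mesh_to_adjacency(4, faces)

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 8))   # per-point features (mapped from 2D image)
W = rng.standard_normal((8, 16))  # layer weights (learned in practice, random here)
out = gcn_layer(A, X, W)
print(out.shape)  # (4, 16): one propagated feature vector per mesh point
```

In the paper's pipeline these propagated point features would additionally be fused with spherical-convolution features before keypoint voting; the sketch above covers only the graph-propagation step.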
