Paper title
3D Shape Reconstruction from Vision and Touch
Paper authors
Paper abstract
When a toddler is presented with a new toy, their instinctual behaviour is to pick it up and inspect it with their hand and eyes in tandem, clearly searching over its surface to properly understand what they are playing with. At any instance here, touch provides high-fidelity localized information while vision provides complementary global context. However, in 3D shape reconstruction, the complementary fusion of visual and haptic modalities remains largely unexplored. In this paper, we study this problem and present an effective chart-based approach to multi-modal shape understanding which encourages a similar fusion of vision and touch information. To do so, we introduce a dataset of simulated touch and vision signals from the interaction between a robotic hand and a large array of 3D objects. Our results show that (1) leveraging both vision and touch signals consistently improves single-modality baselines; (2) our approach outperforms alternative modality fusion methods and strongly benefits from the proposed chart-based structure; (3) the reconstruction quality increases with the number of grasps provided; and (4) the touch information not only enhances the reconstruction at the touch site but also extrapolates to its local neighborhood.
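The following is a minimal, purely illustrative sketch of the intuition in the abstract (high-fidelity touch charts locally refining a coarse vision-based surface); it is not the authors' method, which uses a learned chart-based mesh deformation. All names here (fuse_touch_chart, coarse_vision_points, touch_chart, radius) are hypothetical.

```python
# Conceptual sketch only: refine a coarse "vision" point set with a dense local "touch" chart.
import numpy as np

def fuse_touch_chart(coarse_vision_points, touch_chart, radius=0.05):
    """Replace coarse vision points near the contact site with the dense touch chart,
    mimicking local refinement around the touch neighborhood."""
    center = touch_chart.mean(axis=0)                       # approximate contact location
    dists = np.linalg.norm(coarse_vision_points - center, axis=1)
    kept = coarse_vision_points[dists > radius]             # keep vision points away from the touch site
    return np.concatenate([kept, touch_chart], axis=0)      # fused surface point set

# Toy usage: a coarse sphere from "vision" and a dense patch from one simulated "touch".
theta = np.random.uniform(0, np.pi, 200)
phi = np.random.uniform(0, 2 * np.pi, 200)
vision = np.stack([np.sin(theta) * np.cos(phi),
                   np.sin(theta) * np.sin(phi),
                   np.cos(theta)], axis=1)                  # 200 coarse surface points
touch = vision[:1] + 0.01 * np.random.randn(400, 3)         # 400 dense points near one contact
fused = fuse_touch_chart(vision, touch)
print(fused.shape)
```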