Paper Title
Let Images Give You More: Point Cloud Cross-Modal Training for Shape Analysis
Authors
Abstract
Although recent point cloud analysis achieves impressive progress, the paradigm of representation learning from a single modality is gradually meeting its bottleneck. In this work, we take a step towards more discriminative 3D point cloud representations by fully taking advantage of images, which inherently contain richer appearance information, e.g., texture, color, and shade. Specifically, this paper introduces a simple but effective Point Cloud Cross-Modal Training (PointCMT) strategy, which utilizes view-images, i.e., rendered or projected 2D images of the 3D object, to boost point cloud analysis. In practice, to effectively acquire auxiliary knowledge from view-images, we develop a teacher-student framework and formulate cross-modal learning as a knowledge distillation problem. PointCMT eliminates the distribution discrepancy between different modalities through novel feature and classifier enhancement criteria and effectively avoids potential negative transfer. Note that PointCMT improves the point-only representation without any architecture modification. Extensive experiments verify significant gains on various datasets with appealing backbones: equipped with PointCMT, PointNet++ and PointMLP achieve state-of-the-art performance on two benchmarks, i.e., 94.4% and 86.7% accuracy on ModelNet40 and ScanObjectNN, respectively. Code will be made available at https://github.com/ZhanHeshen/PointCMT.
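To make the teacher-student formulation concrete, below is a minimal, illustrative PyTorch sketch of cross-modal knowledge distillation: a point cloud student is trained with a standard classification loss plus soft-logit and feature-alignment terms computed against a pre-trained image teacher's outputs. All module names, dimensions, and loss weights here (`PointStudent`, `distillation_step`, `T`, `w_kd`, `w_feat`) are assumptions for illustration only; they are not the paper's actual architecture or its feature/classifier enhancement criteria.

```python
# Minimal cross-modal teacher-student distillation sketch (assumed setup,
# NOT the paper's actual PointCMT implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class PointStudent(nn.Module):
    """Toy point cloud encoder: shared MLP + max pooling (PointNet-style)."""
    def __init__(self, feat_dim=256, num_classes=40):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, feat_dim, 1), nn.ReLU(),
        )
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, points):                       # points: (B, 3, N)
        feat = self.mlp(points).max(dim=2).values    # global feature: (B, feat_dim)
        return feat, self.classifier(feat)

def distillation_step(student, teacher_feat, teacher_logits, points, labels,
                      T=4.0, w_kd=1.0, w_feat=1.0):
    """One step: cross-entropy + soft-logit KL + feature mimicry losses."""
    feat, logits = student(points)
    loss_ce = F.cross_entropy(logits, labels)
    # Soften teacher/student logits with temperature T before the KL term.
    loss_kd = F.kl_div(
        F.log_softmax(logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * T * T
    # Align the student's global feature with the image teacher's feature.
    loss_feat = F.mse_loss(feat, teacher_feat)
    return loss_ce + w_kd * loss_kd + w_feat * loss_feat

# Usage: in the real pipeline, teacher_feat and teacher_logits would come
# from a frozen image network run on rendered view-images of the same
# objects; random placeholders are used here so the snippet runs standalone.
student = PointStudent()
points = torch.randn(8, 3, 1024)
labels = torch.randint(0, 40, (8,))
teacher_feat = torch.randn(8, 256)
teacher_logits = torch.randn(8, 40)
loss = distillation_step(student, teacher_feat, teacher_logits, points, labels)
loss.backward()
```

The temperature scaling and the extra feature term follow the generic knowledge distillation recipe; the paper's contribution lies in how the image-teacher knowledge is transferred without distribution mismatch, which this sketch does not attempt to reproduce.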