PseudoAugment: Learning to Use Unlabeled Data for Data Augmentation in Point Clouds

Paper title

PseudoAugment: Learning to Use Unlabeled Data for Data Augmentation in Point Clouds

Authors

Zhaoqi Leng, Shuyang Cheng, Benjamin Caine, Weiyue Wang, Xiao Zhang, Jonathon Shlens, Mingxing Tan, Dragomir Anguelov

Abstract

Data augmentation is an important technique for improving data efficiency and reducing labeling cost for 3D detection in point clouds. Yet existing augmentation policies have so far been designed to use only labeled data, which limits data diversity. In this paper, we recognize that pseudo labeling and data augmentation are complementary, and therefore propose to leverage unlabeled data for data augmentation to enrich the training data. In particular, we design three novel pseudo-label based data augmentation policies (PseudoAugments) that fuse labeled and pseudo-labeled scenes, covering frames (PseudoFrame), objects (PseudoBBox), and background (PseudoBackground). PseudoAugments outperform pseudo labeling by mitigating pseudo labeling errors and generating diverse fused training scenes. We demonstrate that PseudoAugments generalize across point-based and voxel-based architectures, different model capacities, and both the KITTI and Waymo Open Dataset. To alleviate the cost of hyperparameter tuning and iterative pseudo labeling, we develop a population-based data augmentation framework for 3D detection, named AutoPseudoAugment. Unlike previous works that perform pseudo labeling offline, our framework performs PseudoAugments and hyperparameter tuning in one shot to reduce computational cost. Experimental results on the large-scale Waymo Open Dataset show that our method outperforms the state-of-the-art automated data augmentation method (PPBA) and the self-training method (pseudo labeling). In particular, AutoPseudoAugment is about 3x and 2x more data efficient on the vehicle and pedestrian tasks, respectively, compared to prior art. Notably, AutoPseudoAugment nearly matches the full-dataset training results on the vehicle detection task with just 10% of the labeled run segments.
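
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea behind an object-level policy such as PseudoBBox: objects from a pseudo-labeled scene are pasted into a labeled scene. Everything concrete here is an assumption made for illustration and not the authors' method: the `Box` and `Scene` types, the `paste_pseudo_objects` function, the `min_score` confidence threshold, the overlap check, and the use of axis-aligned boxes (real 3D detection boxes also carry a heading angle).

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass
class Box:
    """Axis-aligned 3D box (hypothetical simplification: no heading angle)."""
    center: Point
    size: Point
    score: float = 1.0  # detector confidence; 1.0 for human-labeled boxes

    def contains(self, p: Point) -> bool:
        # A point is inside if it lies within half the box size on every axis.
        return all(abs(p[i] - self.center[i]) <= self.size[i] / 2 for i in range(3))

    def overlaps(self, other: "Box") -> bool:
        # Two axis-aligned boxes overlap if their extents intersect on every axis.
        return all(
            abs(self.center[i] - other.center[i]) < (self.size[i] + other.size[i]) / 2
            for i in range(3)
        )


@dataclass
class Scene:
    points: List[Point]
    boxes: List[Box]


def paste_pseudo_objects(labeled: Scene, pseudo: Scene, min_score: float = 0.8) -> Scene:
    """Fuse confident pseudo-labeled objects into a labeled scene (illustrative only)."""
    points = list(labeled.points)  # copy so the input scene is not mutated
    boxes = list(labeled.boxes)
    for box in pseudo.boxes:
        if box.score < min_score:
            continue  # assumed filter: skip low-confidence pseudo labels
        if any(box.overlaps(existing) for existing in boxes):
            continue  # assumed filter: avoid colliding with objects already in the scene
        # Clear the target region, then copy in the object's points and its box.
        points = [p for p in points if not box.contains(p)]
        points.extend(p for p in pseudo.points if box.contains(p))
        boxes.append(box)
    return Scene(points, boxes)
```

In this sketch the fused scene keeps every human-labeled box and gains only those pseudo-labeled objects that pass both filters, which is one simple way to limit the effect of pseudo labeling errors; the paper may use different criteria.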
