Paper Title

Self-paced Data Augmentation for Training Neural Networks

Authors

Tomoumi Takase, Ryo Karakida, Hideki Asoh

Abstract

Data augmentation is widely used in machine learning; however, no effective method for applying it has been established, even though it involves several factors that must be tuned carefully. One such factor is sample suitability, i.e., selecting the samples that are suitable for data augmentation. The typical approach of applying data augmentation to all training samples disregards sample suitability, which may reduce classifier performance. To address this problem, we propose self-paced augmentation (SPA), which automatically and dynamically selects suitable samples for data augmentation when training a neural network. The proposed method mitigates the deterioration in generalization performance caused by ineffective data augmentation. We discuss two reasons the proposed SPA works: its relation to curriculum learning and desirable changes in loss function instability. Experimental results demonstrate that the proposed SPA improves generalization performance, particularly when the number of training samples is small. In addition, the proposed SPA outperforms the state-of-the-art RandAugment method.
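The abstract does not state SPA's exact selection rule, only that suitable samples are chosen automatically and dynamically during training. As an illustration of the general idea, the sketch below assumes a simple per-sample loss threshold in the spirit of self-paced learning; the function names (`spa_augmentation_mask`, `training_batch`), the threshold parameter, and the direction of selection (augmenting low-loss, i.e. "easy", samples) are all illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

def spa_augmentation_mask(per_sample_losses, threshold):
    """Return a boolean mask of samples selected for augmentation.

    Illustrative self-paced criterion: samples the network already fits
    well (loss below `threshold`) are treated as suitable for augmentation,
    while high-loss samples are left unaugmented for now.
    """
    losses = np.asarray(per_sample_losses, dtype=float)
    return losses < threshold

def training_batch(x_batch, per_sample_losses, threshold, augment):
    """Apply `augment` only to the samples selected by the mask.

    `augment` is any per-sample transform (e.g. flip, crop, noise).
    Unselected samples pass through unchanged.
    """
    mask = spa_augmentation_mask(per_sample_losses, threshold)
    x_out = x_batch.copy()
    x_out[mask] = augment(x_batch[mask])
    return x_out, mask
```

Because the mask is recomputed from the current per-sample losses at every step, the set of augmented samples changes dynamically as training progresses, which is the "self-paced" aspect the abstract describes.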
