Paper Title


C-Mixup: Improving Generalization in Regression

Paper Authors

Huaxiu Yao, Yiping Wang, Linjun Zhang, James Zou, Chelsea Finn

Abstract


Improving the generalization of deep networks is an important open challenge, particularly in domains without plentiful data. The mixup algorithm improves generalization by linearly interpolating a pair of examples and their corresponding labels. These interpolated examples augment the original training set. Mixup has shown promising results in various classification tasks, but systematic analysis of mixup in regression remains underexplored. Using mixup directly on regression labels can result in arbitrarily incorrect labels. In this paper, we propose a simple yet powerful algorithm, C-Mixup, to improve generalization on regression tasks. In contrast with vanilla mixup, which picks training examples for mixing with uniform probability, C-Mixup adjusts the sampling probability based on the similarity of the labels. Our theoretical analysis confirms that C-Mixup with label similarity obtains a smaller mean square error in supervised regression and meta-regression than vanilla mixup and using feature similarity. Another benefit of C-Mixup is that it can improve out-of-distribution robustness, where the test distribution is different from the training distribution. By selectively interpolating examples with similar labels, it mitigates the effects of domain-associated information and yields domain-invariant representations. We evaluate C-Mixup on eleven datasets, ranging from tabular to video data. Compared to the best prior approach, C-Mixup achieves 6.56%, 4.76%, 5.82% improvements in in-distribution generalization, task generalization, and out-of-distribution robustness, respectively. Code is released at https://github.com/huaxiuyao/C-Mixup.
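The core idea described above can be sketched in a few lines: instead of pairing examples uniformly at random as in vanilla mixup, each example draws its mixing partner with probability proportional to a kernel on the label distance, so that only examples with similar labels are interpolated. This is a minimal illustrative sketch, not the authors' implementation; the bandwidth `sigma`, the Beta parameter `alpha`, and the Gaussian kernel choice here are illustrative assumptions.

```python
import numpy as np

def c_mixup_batch(X, y, sigma=1.0, alpha=2.0, rng=None):
    """Sketch of C-Mixup for 1-D regression labels.

    For each anchor example i, a partner j is sampled with probability
    proportional to exp(-(y_i - y_j)^2 / (2 * sigma^2)), then the pair is
    linearly interpolated with a Beta(alpha, alpha) coefficient, as in mixup.
    sigma and alpha are illustrative choices, not values from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(X)
    X_mix = np.empty_like(X, dtype=float)
    y_mix = np.empty_like(y, dtype=float)
    for i in range(n):
        # Closer labels -> higher probability of being picked as the partner.
        d2 = (y - y[i]) ** 2
        p = np.exp(-d2 / (2 * sigma ** 2))
        p /= p.sum()
        j = rng.choice(n, p=p)
        # Standard mixup interpolation of both inputs and labels.
        lam = rng.beta(alpha, alpha)
        X_mix[i] = lam * X[i] + (1 - lam) * X[j]
        y_mix[i] = lam * y[i] + (1 - lam) * y[j]
    return X_mix, y_mix
```

With a small bandwidth, partners are effectively restricted to near-identical labels, so the interpolated labels stay close to true regression targets; a large bandwidth recovers (approximately) uniform pairing, i.e. vanilla mixup.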
