Paper Title


Just Mix Once: Worst-group Generalization by Group Interpolation

Authors

Giorgio Giannone, Serhii Havrylov, Jordan Massiah, Emine Yilmaz, Yunlong Jiao

Abstract


Advances in deep learning theory have revealed how average generalization relies on superficial patterns in data. The consequence is brittle models with poor performance under shifts in group distribution at test time. When group annotation is available, we can use robust optimization tools to tackle the problem. However, identification and annotation are time-consuming, especially on large datasets. A recent line of work leverages self-supervision and oversampling to improve generalization on minority groups without group annotation. We propose to unify and generalize these approaches using a class-conditional variant of mixup tailored for worst-group generalization. Our approach, Just Mix Once (JM1), interpolates samples during learning, augmenting the training distribution with a continuous mixture of groups. JM1 is domain agnostic and computationally efficient, can be used with any level of group annotation, and performs on par with or better than the state-of-the-art on worst-group generalization. Additionally, we provide a simple explanation of why JM1 works.
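The abstract describes interpolating same-class samples so that training sees a continuous mixture of groups. A minimal NumPy sketch of that idea is below; the Beta mixing distribution, the pairing strategy (same class, different group), and the function names `jm1_mixup` and `make_group_pairs` are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def jm1_mixup(x_a, x_b, alpha=2.0, rng=None):
    """Convexly combine two same-class samples with a Beta-sampled weight.

    Assumption: a Beta(alpha, alpha) mixing weight, as in standard mixup;
    the paper's exact sampling scheme may differ.
    """
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)  # mixing weight in (0, 1)
    return lam * x_a + (1.0 - lam) * x_b, lam


def make_group_pairs(y, groups, rng=None):
    """Pair indices that share a class label but come from different groups,
    so interpolation yields a continuous mixture of groups per class."""
    if rng is None:
        rng = np.random.default_rng()
    pairs = []
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        partner = rng.permutation(idx)
        for i, j in zip(idx, partner):
            if groups[i] != groups[j]:  # keep cross-group pairs only
                pairs.append((i, j))
    return pairs
```

In a training loop, each minibatch would draw cross-group pairs with `make_group_pairs` and feed the mixed inputs `jm1_mixup` returns to the model, keeping the shared class label as the target.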
