Paper Title
When is invariance useful in an Out-of-Distribution Generalization problem?
Paper Authors
Paper Abstract
The goal of the Out-of-Distribution (OOD) generalization problem is to train a predictor that generalizes across all environments. Popular approaches in this field rely on the hypothesis that such a predictor should be an \textit{invariant predictor}, one that captures the mechanism that remains constant across environments. While these approaches have been experimentally successful in various case studies, there is still much room for theoretical validation of this hypothesis. This paper presents a new set of theoretical conditions necessary for an invariant predictor to achieve OOD optimality. Our theory not only applies to non-linear cases but also generalizes the necessary condition used in \citet{rojas2018invariant}. We also derive the Inter Gradient Alignment algorithm from our theory and demonstrate its competitiveness on MNIST-derived benchmark datasets as well as on two of the three \textit{Invariance Unit Tests} proposed by \citet{aubinlinear}.
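The abstract does not spell out the algorithm, but one plausible reading of "gradient alignment" is to penalize disagreement among per-environment gradients of the risk. The sketch below is a hypothetical illustration, not the paper's method: it uses a linear model with squared loss, and the function names (`iga_objective`, `train`), the variance-of-gradients penalty, and the numerical optimizer are all assumptions made for the example.

```python
import numpy as np

def iga_objective(w, envs, lam=1.0):
    """Hypothetical inter-gradient-alignment objective for a linear model.

    Each element of `envs` is an (X, y) pair. Per-environment gradients of
    the squared loss are "aligned" by penalizing their variance across
    environments; the penalty vanishes exactly when all gradients agree.
    """
    grads, losses = [], []
    for X, y in envs:
        r = X @ w - y                       # residuals in this environment
        losses.append(0.5 * np.mean(r**2))  # environment risk
        grads.append(X.T @ r / len(y))      # gradient of that risk w.r.t. w
    grads = np.stack(grads)                 # shape: (n_envs, dim)
    mean_loss = float(np.mean(losses))
    # trace of the variance of per-environment gradients (assumed penalty)
    grad_var = float(np.mean(np.sum((grads - grads.mean(0)) ** 2, axis=1)))
    return mean_loss + lam * grad_var

def train(envs, dim, lam=1.0, lr=0.05, steps=500):
    """Minimize the objective by central-difference gradient descent
    (a deliberately simple optimizer for this sketch)."""
    w = np.zeros(dim)
    eps = 1e-5
    for _ in range(steps):
        g = np.zeros(dim)
        for i in range(dim):
            e = np.zeros(dim)
            e[i] = eps
            g[i] = (iga_objective(w + e, envs, lam)
                    - iga_objective(w - e, envs, lam)) / (2 * eps)
        w -= lr * g
    return w
```

When the environments are identical the penalty term is zero and the objective reduces to the pooled risk; the penalty only bites when environments pull the gradient in different directions.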