Paper Title

REPAIR: REnormalizing Permuted Activations for Interpolation Repair

Paper Authors

Keller Jordan, Hanie Sedghi, Olga Saukh, Rahim Entezari, Behnam Neyshabur

Abstract

In this paper we look into the conjecture of Entezari et al. (2021) which states that if the permutation invariance of neural networks is taken into account, then there is likely no loss barrier to the linear interpolation between SGD solutions. First, we observe that neuron alignment methods alone are insufficient to establish low-barrier linear connectivity between SGD solutions due to a phenomenon we call variance collapse: interpolated deep networks suffer a collapse in the variance of their activations, causing poor performance. Next, we propose REPAIR (REnormalizing Permuted Activations for Interpolation Repair) which mitigates variance collapse by rescaling the preactivations of such interpolated networks. We explore the interaction between our method and the choice of normalization layer, network width, and depth, and demonstrate that using REPAIR on top of neuron alignment methods leads to 60%-100% relative barrier reduction across a wide variety of architecture families and tasks. In particular, we report a 74% barrier reduction for ResNet50 on ImageNet and 90% barrier reduction for ResNet18 on CIFAR10.
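A minimal sketch of the two steps the abstract describes, assuming PyTorch models with BatchNorm layers: linearly interpolate the weights of two (already permutation-aligned) SGD solutions, then repair the collapsed activation statistics. Here the repair step is the BatchNorm special case of re-estimating running statistics on training data; the general REPAIR procedure instead rescales each preactivation channel to match the interpolation of the two endpoint networks' per-channel mean and standard deviation. The names `model_a`, `model_b`, `train_loader`, and `alpha` are placeholders, not identifiers from the paper.

```python
import copy
import torch

def interpolate_state_dicts(sd_a, sd_b, alpha):
    """Element-wise linear interpolation of two (permutation-aligned) state dicts."""
    return {k: (1 - alpha) * sd_a[k] + alpha * sd_b[k] for k in sd_a}

@torch.no_grad()
def repair_bn_stats(model, loader, device="cpu", num_batches=100):
    """Re-estimate BatchNorm running statistics on training data.

    This addresses variance collapse for BatchNorm networks; full REPAIR
    generalizes this by rescaling preactivations so their statistics match
    the interpolation of the endpoint networks' statistics.
    """
    for m in model.modules():
        if isinstance(m, torch.nn.modules.batchnorm._BatchNorm):
            m.reset_running_stats()
    model.train()  # BN layers only update running stats in train mode
    for i, (x, _) in enumerate(loader):
        if i >= num_batches:
            break
        model(x.to(device))
    model.eval()

# Hypothetical usage, with model_a / model_b two aligned SGD solutions:
# model = copy.deepcopy(model_a)
# sd = interpolate_state_dicts(model_a.state_dict(), model_b.state_dict(), alpha=0.5)
# model.load_state_dict(sd)
# repair_bn_stats(model, train_loader)
```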
