Paper title
The shape and simplicity biases of adversarially robust ImageNet-trained CNNs
Paper authors
Paper abstract
Increasingly many similarities between human vision and convolutional neural networks (CNNs) have been revealed in the past few years. Yet, vanilla CNNs often fall short in generalizing to adversarial or out-of-distribution (OOD) examples, on which humans demonstrate superior performance. Adversarial training is a leading learning algorithm for improving the robustness of CNNs on adversarial and OOD data; however, little is known about the properties of adversarially robust CNNs, specifically the shape bias and the internal features they learn. In this paper, we perform a thorough, systematic study to understand the shape bias and some internal mechanisms that enable the generalizability of AlexNet, GoogLeNet, and ResNet-50 models trained via adversarial training. We find that while standard ImageNet classifiers have a strong texture bias, their R (robust) counterparts rely heavily on shapes. Remarkably, adversarial training induces three simplicity biases into hidden neurons in the process of "robustifying" CNNs. That is, each convolutional neuron in R networks often changes to detect (1) pixel-wise smoother patterns, i.e., a mechanism that blocks high-frequency noise from passing through the network; (2) more lower-level features, i.e., textures and colors (instead of objects); and (3) fewer types of inputs. Our findings reveal interesting mechanisms that make networks more adversarially robust and also explain some recent findings, e.g., why R networks benefit from a much larger capacity (Xie et al., 2020) and can act as a strong image prior in image synthesis (Santurkar et al., 2019).
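For readers unfamiliar with the training procedure the abstract refers to, below is a minimal sketch of adversarial training in PyTorch, assuming the common PGD formulation under an L-infinity constraint (Madry et al., 2018). The model choice, epsilon, step size, and number of attack steps are illustrative assumptions, not the exact settings used for the AlexNet, GoogLeNet, and ResNet-50 models studied in the paper.

```python
# Minimal PGD adversarial-training sketch (illustrative hyperparameters).
import torch
import torch.nn.functional as F
import torchvision

model = torchvision.models.resnet50(num_classes=1000)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=7):
    """Gradient-ascent steps on the loss, projected into an eps L-inf ball around x."""
    x_adv = x.clone().detach() + torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Take a signed ascent step, then project back into the eps-ball.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + torch.clamp(x_adv - x, -eps, eps)
        x_adv = torch.clamp(x_adv, 0.0, 1.0)  # keep a valid image
    return x_adv.detach()

def train_step(x, y):
    model.eval()                    # fix BatchNorm stats while crafting the attack
    x_adv = pgd_attack(model, x, y)
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)  # train on adversarial examples only
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this inner-maximization / outer-minimization loop, the network only ever sees worst-case perturbed inputs during the weight update, which is the pressure the abstract credits with inducing the shape bias and the three simplicity biases in hidden neurons.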