Paper Title


Enhancing Intrinsic Adversarial Robustness via Feature Pyramid Decoder

Authors

Guanlin Li, Shuya Ding, Jun Luo, Chang Liu

Abstract


Whereas adversarial training is employed as the main defence strategy against specific adversarial samples, it has limited generalization capability and incurs excessive time complexity. In this paper, we propose an attack-agnostic defence framework to enhance the intrinsic robustness of neural networks, without jeopardizing their ability to generalize to clean samples. Our Feature Pyramid Decoder (FPD) framework applies to all block-based convolutional neural networks (CNNs). It implants denoising and image restoration modules into a targeted CNN, and it also constrains the Lipschitz constant of the classification layer. Moreover, we propose a two-phase strategy to train the FPD-enhanced CNN, utilizing $ε$-neighbourhood noisy images with multi-task and self-supervised learning. Evaluated against a variety of white-box and black-box attacks, we demonstrate that FPD-enhanced CNNs gain sufficient robustness against general adversarial samples on MNIST, SVHN and CALTECH. In addition, if we further conduct adversarial training, the FPD-enhanced CNNs perform better than their non-enhanced versions.
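The abstract names two concrete ingredients: training on $ε$-neighbourhood noisy images, and constraining the Lipschitz constant of the classification layer. A minimal NumPy sketch of both, assuming uniform per-pixel noise and spectral-norm rescaling of a dense layer's weight (the abstract does not pin down either implementation, so both choices here are assumptions):

```python
import numpy as np

def epsilon_neighbourhood(images, eps=0.1, rng=None):
    """Sample training inputs from the eps-neighbourhood of clean images.

    Each pixel is perturbed by uniform noise in [-eps, eps] and clipped back
    to the valid range [0, 1]. The uniform noise model is an assumption; the
    abstract only says 'eps-neighbourhood noisy images'."""
    rng = np.random.default_rng(rng)
    noise = rng.uniform(-eps, eps, size=images.shape)
    return np.clip(images + noise, 0.0, 1.0)

def constrain_lipschitz(weight, c=1.0):
    """Rescale a dense classification layer's weight matrix so its spectral
    norm (its Lipschitz constant w.r.t. the L2 norm) is at most c."""
    sigma = np.linalg.norm(weight, ord=2)  # largest singular value
    return weight if sigma <= c else weight * (c / sigma)

# Example: noisy copies of a batch, and a 1-Lipschitz classifier weight.
imgs = np.random.default_rng(0).uniform(size=(4, 28, 28))
noisy = epsilon_neighbourhood(imgs, eps=0.3, rng=1)
W = np.random.default_rng(2).normal(size=(10, 64))
W_c = constrain_lipschitz(W, c=1.0)
```

In the paper's two-phase setting, such noisy batches would feed the self-supervised restoration objective while the rescaled weight keeps the classification head's sensitivity to input perturbations bounded.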
