Paper Title
Posterior Differential Regularization with f-divergence for Improving Model Robustness
Paper Authors
Paper Abstract
We address the problem of enhancing model robustness through regularization. Specifically, we focus on methods that regularize the model posterior difference between clean and noisy inputs. Theoretically, we show that two recent methods, Jacobian Regularization and Virtual Adversarial Training, are connected under this framework. Additionally, we generalize posterior differential regularization to the family of $f$-divergences and characterize the overall regularization framework in terms of the Jacobian matrix. Empirically, we systematically compare these regularization methods with standard BERT training on a diverse set of tasks to provide a comprehensive profile of their effect on model in-domain and out-of-domain generalization. For both fully supervised and semi-supervised settings, our experiments show that regularizing the posterior differential with an $f$-divergence can substantially improve model robustness. In particular, with a proper $f$-divergence, a BERT-base model can achieve generalization comparable to its BERT-large counterpart in in-domain, adversarial, and domain-shift scenarios, indicating the great potential of the proposed framework for boosting the generalization of NLP models.
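To make the abstract's central idea concrete, the following is a minimal PyTorch sketch (not the authors' released code) of posterior differential regularization: the task loss is augmented with an $f$-divergence between the model's posteriors on clean and noise-perturbed inputs. KL divergence is used here as one illustrative member of the $f$-divergence family, Gaussian noise on the input embeddings stands in for the perturbation, and the function name, `model` interface, and hyperparameters are assumptions for illustration.

```python
# Minimal sketch of posterior differential regularization (illustrative only).
# Assumption: `model` maps input embeddings [batch, ..., dim] to class logits.
import torch
import torch.nn.functional as F

def posterior_differential_loss(model, embeds, labels, noise_std=1e-3, lam=1.0):
    """Task loss + lam * D_f( p(.|x) || p(.|x + delta) ), with D_f = KL here."""
    logits_clean = model(embeds)                       # [batch, num_classes]
    task_loss = F.cross_entropy(logits_clean, labels)

    # Perturb the input embeddings with small Gaussian noise.
    noise = torch.randn_like(embeds) * noise_std
    logits_noisy = model(embeds + noise)

    # KL divergence between clean and noisy posteriors (one f-divergence choice).
    # A VAT-style variant would detach log_p_clean to stop gradients
    # through the clean branch.
    log_p_clean = F.log_softmax(logits_clean, dim=-1)
    log_p_noisy = F.log_softmax(logits_noisy, dim=-1)
    reg = F.kl_div(log_p_noisy, log_p_clean, log_target=True,
                   reduction="batchmean")

    return task_loss + lam * reg
```

Swapping the KL term for another $f$-divergence (e.g., squared Hellinger or Jensen-Shannon) only changes the `reg` computation; the rest of the training loop is unchanged.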