Paper Title
Regularization Helps with Mitigating Poisoning Attacks: Distributionally-Robust Machine Learning Using the Wasserstein Distance
Paper Authors
Paper Abstract
We use distributionally-robust optimization for machine learning to mitigate the effect of data poisoning attacks. We provide performance guarantees for the trained model on the original data (excluding the poisoned records) by training the model for the worst-case distribution on a neighbourhood around the empirical distribution (extracted from the training dataset corrupted by a poisoning attack) defined using the Wasserstein distance. We relax the distributionally-robust machine learning problem by finding an upper bound for the worst-case fitness based on the empirical sample-averaged fitness and the Lipschitz constant of the fitness function (with respect to the data, for given model parameters) as a regularizer. For regression models, we prove that this regularizer equals the dual norm of the model parameters. We use the Wine Quality dataset, the Boston Housing Market dataset, and the Adult dataset to demonstrate the results of this paper.
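For concreteness, the relaxation described in the abstract can be summarized as the following upper bound on the worst-case fitness over a Wasserstein ball. The notation here (ρ for the Wasserstein radius, P̂_n for the empirical distribution of the possibly poisoned data, ℓ for the fitness function, z for a data point) is introduced for illustration and may differ from the paper's:

```latex
% Worst-case fitness over a Wasserstein ball of radius \rho around the
% empirical distribution \hat{P}_n, bounded by the empirical sample-averaged
% fitness plus a Lipschitz-constant regularizer (illustrative notation):
\sup_{Q \,:\, W(Q,\hat{P}_n) \le \rho} \mathbb{E}_{Q}\big[\ell(\theta; z)\big]
  \;\le\; \mathbb{E}_{\hat{P}_n}\big[\ell(\theta; z)\big]
  \;+\; \rho \, \mathrm{Lip}_{z}\big(\ell(\theta;\cdot)\big)
% Per the abstract, for regression models the Lipschitz constant reduces to
% the dual norm \|\theta\|_* of the model parameters.
```

A minimal Python sketch of the resulting regularized regression, assuming an absolute-error fitness, an ℓ2 ground metric on the data (so the dual norm is again ℓ2), and a hypothetical radius `rho`; this illustrates the regularized relaxation and is not the authors' implementation:

```python
import numpy as np
from scipy.optimize import minimize

def dro_regression(X, y, rho=0.05):
    """Fit theta by minimizing the empirical sample-averaged absolute loss
    plus rho times the (self-dual) l2 norm of theta, i.e. the regularized
    upper bound sketched above."""
    _, d = X.shape

    def objective(theta):
        empirical = np.mean(np.abs(y - X @ theta))      # sample-averaged fitness
        return empirical + rho * np.linalg.norm(theta)  # dual-norm regularizer

    return minimize(objective, np.zeros(d), method="Nelder-Mead").x

# Toy usage with synthetic data standing in for a (possibly poisoned) dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)
print(dro_regression(X, y))
```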