Gradient Based Activations for Accurate Bias-Free Learning

Paper title

Gradient Based Activations for Accurate Bias-Free Learning

Authors

Vinod K. Kurmi, Rishabh Sharma, Yash Vardhan Sharma, Vinay P. Namboodiri

Abstract

Bias mitigation in machine learning models is imperative, yet challenging. While several approaches have been proposed, one view towards mitigating bias is through adversarial learning. A discriminator is used to identify the bias attribute in question, such as gender, age or race. This discriminator is then used adversarially to ensure that the learned features cannot be used to distinguish the bias attribute. The main drawback of such a model is that it directly introduces a trade-off with accuracy, because the features that the discriminator deems sensitive for bias discrimination could also be correlated with classification. In this work we solve this problem. We show that a biased discriminator can actually be used to improve this bias-accuracy trade-off. Specifically, this is achieved by a feature masking approach that uses the discriminator's gradients. We ensure that the features favoured for bias discrimination are de-emphasized and the unbiased features are enhanced during classification. We show that this simple approach works well to reduce bias as well as improve accuracy significantly. We evaluate the proposed model on standard benchmarks. We improve the accuracy of the adversarial methods while maintaining or even improving the unbiasedness, and also outperform several other recent methods.
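
The abstract does not spell out how the mask is computed, so the following is only a minimal sketch of the general idea of gradient-based feature masking, not the authors' implementation. It assumes a hypothetical linear-logistic bias discriminator (so the gradient of its loss with respect to each feature has a closed form) and a hypothetical mask rule that maps large gradient magnitudes to small mask values. All function names, the normalization, and the toy numbers are illustrative assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bias_gradient(features, weights, bias_term, bias_label):
    """Gradient of a logistic discriminator's log-loss w.r.t. each feature.

    Assumption: the bias discriminator is p = sigmoid(w . f + b), for which
    d(loss)/d(f_i) = (p - y) * w_i.
    """
    logit = sum(w * f for w, f in zip(weights, features)) + bias_term
    p = sigmoid(logit)
    return [(p - bias_label) * w for w in weights]


def gradient_mask(grads, eps=1e-12):
    """Hypothetical mask rule: features with large gradient magnitude
    (useful for predicting the bias attribute) get a mask value near 0,
    features with small gradient magnitude get a value near 1."""
    mags = [abs(g) for g in grads]
    top = max(mags)
    return [1.0 - m / (top + eps) for m in mags]


def mask_features(features, mask):
    """Element-wise re-weighting of the features before classification."""
    return [f * m for f, m in zip(features, mask)]


# Toy example: feature 0 is strongly used by the discriminator,
# feature 1 is not used at all, feature 2 is weakly used.
features = [0.8, -1.2, 0.5]
disc_weights = [2.0, 0.0, -0.5]

grads = bias_gradient(features, disc_weights, bias_term=0.0, bias_label=1.0)
mask = gradient_mask(grads)
masked = mask_features(features, mask)
```

In this toy setup the mask comes out close to 0 for the feature the discriminator relies on most, exactly 1 for the feature it ignores, and about 0.75 for the weakly used one, so the classifier would see the bias-sensitive feature suppressed. A real model would obtain the gradients by automatic differentiation through a learned discriminator rather than from a closed form.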

Paper title