Title

Fair mapping

Authors

Sébastien Gambs, Rosin Claude Ngueveu

Abstract

To mitigate the effects of undesired biases in models, several approaches propose pre-processing the input dataset to reduce the risk of discrimination by preventing the inference of sensitive attributes. Unfortunately, most of these pre-processing methods generate a new distribution that is very different from the original one, which often leads to unrealistic data. As a side effect, existing models need to be re-trained on this new data distribution to make accurate predictions. To address this issue, we propose a novel pre-processing method, which we call fair mapping, based on transforming the distribution of protected groups onto a chosen target distribution, with additional privacy constraints whose objective is to prevent the inference of sensitive attributes. More precisely, we build on recent work on the Wasserstein GAN and AttGAN frameworks to achieve the optimal transport of data points, coupled with a discriminator that enforces protection against attribute inference. Our proposed approach preserves the interpretability of the data and can be used without defining the sensitive groups exactly. In addition, our approach can be specialized to model existing state-of-the-art approaches, thus offering a unifying view of these methods. Finally, several experiments on real and synthetic datasets demonstrate that our approach is able to hide the sensitive attributes while limiting the distortion of the data and improving the fairness of subsequent data analysis tasks.
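
To make the idea of transporting one group's distribution onto a target distribution concrete, the sketch below shows the one-dimensional special case, where the optimal transport map between two empirical distributions reduces to quantile matching. This is only an illustration and not the authors' method, which relies on Wasserstein GAN and AttGAN and includes a discriminator against attribute inference. The function name `quantile_map` and the two synthetic groups are assumptions made for the example.

```python
import numpy as np

def quantile_map(source, target):
    """Map each source sample to the target value at the same empirical quantile.

    In one dimension this monotone rearrangement is the optimal transport
    map for convex costs. It is an illustration only, not the GAN-based
    approach described in the abstract.
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    # Rank of each source point within its own group (0 .. n-1).
    ranks = np.argsort(np.argsort(source))
    # Convert ranks to mid-point quantiles in (0, 1).
    quantiles = (ranks + 0.5) / len(source)
    # Read off the target distribution at those quantiles.
    return np.quantile(target, quantiles)

# Hypothetical synthetic groups, chosen only for this example.
rng = np.random.default_rng(0)
protected = rng.normal(loc=-1.0, scale=0.5, size=1000)
target = rng.normal(loc=1.0, scale=1.0, size=1000)

mapped = quantile_map(protected, target)
```

After the mapping, `mapped` follows the target group's distribution while each point keeps its relative position within the protected group, which is the sense in which the distortion of the data is limited in this toy setting.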
