Paper Title
Simultaneous Improvement of ML Model Fairness and Performance by Identifying Bias in Data
Paper Authors
Paper Abstract
Machine learning models built on datasets containing discriminative instances attributed to various underlying factors result in biased and unfair outcomes. It is a well-founded and intuitive fact that existing bias mitigation strategies often sacrifice accuracy in order to ensure fairness. But when an AI engine's predictions are used for decisions that affect revenue or operational efficiency, such as credit risk modelling, the business would expect accuracy to be reasonably preserved. This conflicting requirement of maintaining both accuracy and fairness in AI motivates our research. In this paper, we propose a fresh approach for simultaneously improving the fairness and accuracy of ML models within a realistic paradigm. The essence of our work is a data preprocessing technique that can detect instances ascribing a specific kind of bias that should be removed from the dataset before training, and we further show that such instance removal has no adverse impact on model accuracy. In particular, we claim that in problem settings where instances exist with similar features but different labels caused by variation in protected attributes, an inherent bias is induced in the dataset, which can be identified and mitigated through our novel scheme. Our experimental evaluation on two open-source datasets demonstrates how the proposed method can mitigate bias while improving rather than degrading accuracy, and offers a certain set of controls to the end user.
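To make the core idea concrete, here is a minimal sketch of such a preprocessing step, not the authors' actual algorithm: it assumes the data is a pandas DataFrame with one protected-attribute column and one label column, and it approximates "similar features" by exact equality on all remaining columns (the paper's similarity notion may well be more sophisticated). The function name and column names are hypothetical.

```python
# Illustrative sketch only -- NOT the paper's exact procedure. Assumes a
# pandas DataFrame with one protected-attribute column and one label column;
# "similar features" is approximated by exact equality on the other columns.
import pandas as pd

def remove_protected_label_conflicts(df: pd.DataFrame,
                                     protected: str,
                                     label: str) -> pd.DataFrame:
    """Drop instances whose non-protected features match but whose labels
    differ while the protected attribute also varies within the group."""
    feature_cols = [c for c in df.columns if c not in (protected, label)]
    biased_index = []
    for _, group in df.groupby(feature_cols, dropna=False):
        # Flag the group when identical features yield conflicting labels
        # AND the protected attribute differs -- the kind of instance the
        # abstract describes as inducing an inherent bias.
        if group[label].nunique() > 1 and group[protected].nunique() > 1:
            biased_index.extend(group.index)
    return df.drop(index=biased_index)

# Hypothetical usage on an income dataset with a "sex" protected attribute:
# df_clean = remove_protected_label_conflicts(df, protected="sex", label="income")
```

Because only the flagged conflict groups are dropped, the remaining training set is consistent with the abstract's claim that removal need not degrade accuracy; how aggressively to match features is the kind of end-user control the abstract alludes to.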