Paper Title

A Comprehensive Empirical Study of Bias Mitigation Methods for Machine Learning Classifiers

Authors

Zhenpeng Chen, Jie M. Zhang, Federica Sarro, Mark Harman

Abstract

Software bias is an increasingly important operational concern for software engineers. We present a large-scale, comprehensive empirical study of 17 representative bias mitigation methods for Machine Learning (ML) classifiers, evaluated with 11 ML performance metrics (e.g., accuracy), 4 fairness metrics, and 20 types of fairness-performance trade-off assessment, applied to 8 widely adopted software decision tasks. This empirical coverage is more comprehensive than previous work on this important software property, spanning the largest number of bias mitigation methods, evaluation metrics, and fairness-performance trade-off measures. We find that (1) the bias mitigation methods significantly decrease ML performance in 53% of the studied scenarios (ranging between 42% and 66% depending on the ML performance metric); (2) the bias mitigation methods significantly improve fairness, as measured by the 4 metrics, in 46% of all scenarios (ranging between 24% and 59% depending on the fairness metric); (3) the bias mitigation methods even lead to a decrease in both fairness and ML performance in 25% of the scenarios; (4) the effectiveness of the bias mitigation methods depends on the task, the model, the choice of protected attributes, and the set of metrics used to assess fairness and ML performance; (5) no bias mitigation method achieves the best trade-off in all scenarios; the best method we find outperforms the others in only 30% of the scenarios. Researchers and practitioners therefore need to choose the bias mitigation method best suited to their intended application scenario(s).
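To make the kind of fairness measurement described above concrete, here is a minimal sketch of one widely used fairness metric, statistical parity difference (SPD), computed from binary predictions and a binary protected attribute. The abstract does not name the 4 fairness metrics the study uses, so SPD here is an assumed illustrative example, not necessarily one of them.

```python
def statistical_parity_difference(y_pred, protected):
    """SPD = P(pred=1 | unprivileged) - P(pred=1 | privileged).

    A value of 0 indicates parity between groups; negative values mean
    the unprivileged group (protected == 0) receives favorable
    predictions less often. Group encoding is an assumption for this
    sketch, not taken from the paper.
    """
    priv = [p for p, a in zip(y_pred, protected) if a == 1]
    unpriv = [p for p, a in zip(y_pred, protected) if a == 0]
    rate = lambda xs: sum(xs) / len(xs)
    return rate(unpriv) - rate(priv)

# Hypothetical predictions: the privileged group (protected=1) is
# predicted favorably at rate 0.75, the unprivileged group at 0.25.
preds     = [1, 0, 1, 1, 0, 1, 0, 0]
protected = [1, 1, 1, 1, 0, 0, 0, 0]
print(statistical_parity_difference(preds, protected))  # 0.25 - 0.75 = -0.5
```

A bias mitigation method would aim to move this value toward 0, while the study's trade-off assessments track how much ML performance (e.g., accuracy) is sacrificed in the process.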
