Paper Title
Explanation-Guided Fairness Testing through Genetic Algorithm
Paper Authors
Paper Abstract
Fairness is a critical attribute of trusted AI systems. A plethora of research has proposed diverse methods for individual fairness testing. However, these methods suffer from three major limitations: low efficiency, low effectiveness, and model specificity. This work proposes ExpGA, an explanation-guided fairness testing approach based on a genetic algorithm (GA). ExpGA uses the explanation results generated by interpretable methods to collect high-quality initial seeds, from which discriminatory samples can readily be derived by slightly modifying feature values. ExpGA then applies a GA to search for discriminatory sample candidates by optimizing a fitness value. Benefiting from this combination of explanation results and GA, ExpGA is both efficient and effective at detecting discriminatory individuals. Moreover, ExpGA requires only the prediction probabilities of the tested model, which lets it generalize better across different models. Experiments on multiple real-world benchmarks, including tabular and text datasets, show that ExpGA achieves higher efficiency and effectiveness than four state-of-the-art approaches.
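To make the search step concrete, the following is a minimal sketch of a GA-based search for individually discriminatory samples that relies only on prediction probabilities. It is not the ExpGA implementation: the fitness definition (the probability gap across protected-attribute values), the crossover and mutation operators, the names `fitness`, `is_discriminatory`, `ga_search`, and `toy_model`, and the toy data are all assumptions made for illustration. The explanation-guided seed selection is not shown; the seeds are simply taken as given.

```python
import random


def fitness(predict_proba, x, protected_idx, protected_values):
    """Gap in predicted probability when only the protected attribute changes.
    (Assumed fitness definition, chosen for illustration.)"""
    probs = []
    for v in protected_values:
        variant = list(x)
        variant[protected_idx] = v
        probs.append(predict_proba(variant))
    return max(probs) - min(probs)


def is_discriminatory(predict_proba, x, protected_idx, protected_values,
                      threshold=0.5):
    """True if changing only the protected attribute changes the predicted label."""
    labels = set()
    for v in protected_values:
        variant = list(x)
        variant[protected_idx] = v
        labels.add(predict_proba(variant) >= threshold)
    return len(labels) > 1


def ga_search(predict_proba, seeds, protected_idx, protected_values, bounds,
              generations=30, mutation_rate=0.3, rng=None):
    """Evolve the seeds and collect every discriminatory sample encountered."""
    if len(seeds) < 2:
        raise ValueError("need at least two seeds for crossover")
    rng = rng or random.Random(0)
    population = [list(s) for s in seeds]
    size = len(population)
    found = []
    for _ in range(generations):
        # Rank the population by fitness, best first.
        scored = sorted(
            population,
            key=lambda x: fitness(predict_proba, x, protected_idx, protected_values),
            reverse=True,
        )
        for x in scored:
            if is_discriminatory(predict_proba, x, protected_idx,
                                 protected_values) and x not in found:
                found.append(list(x))
        parents = scored[:max(2, size // 2)]
        children = []
        while len(children) < size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(a))      # single-point crossover
            child = a[:cut] + b[cut:]
            for i, (lo, hi) in enumerate(bounds):
                # Mutate only non-protected features, within their bounds.
                if i != protected_idx and rng.random() < mutation_rate:
                    child[i] = rng.randint(lo, hi)
            children.append(child)
        population = children
    return found


def toy_model(x):
    """Hypothetical biased model over [gender, age_bucket, hours_bucket]."""
    return min(1.0, 0.3 + 0.05 * x[2] + 0.3 * x[0])


seeds = [[0, 1, 8], [1, 2, 9], [0, 3, 7], [1, 4, 6]]
bounds = [(0, 1), (0, 9), (0, 9)]
found = ga_search(toy_model, seeds, protected_idx=0,
                  protected_values=[0, 1], bounds=bounds)
print(len(found), "discriminatory samples found")
```

Because the sketch only calls `predict_proba`, the same loop could in principle be pointed at any model that exposes prediction probabilities, which is the model-agnostic property the abstract emphasizes.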