Paper Title

Toward Understanding Bias Correlations for Mitigation in NLP

Paper Authors

Lu Cheng, Suyu Ge, Huan Liu

Paper Abstract

Natural Language Processing (NLP) models have been found discriminative against groups of different social identities such as gender and race. With the negative consequences of these undesired biases, researchers have responded with unprecedented effort and proposed promising approaches for bias mitigation. In spite of considerable practical importance, current algorithmic fairness literature lacks an in-depth understanding of the relations between different forms of biases. Social bias is complex by nature. Numerous studies in social psychology identify the "generalized prejudice", i.e., generalized devaluing sentiments across different groups. For example, people who devalue ethnic minorities are also likely to devalue women and gays. Therefore, this work aims to provide a first systematic study toward understanding bias correlations in mitigation. In particular, we examine bias mitigation in two common NLP tasks -- toxicity detection and word embeddings -- on three social identities, i.e., race, gender, and religion. Our findings suggest that biases are correlated and present scenarios in which independent debiasing approaches dominant in current literature may be insufficient. We further investigate whether jointly mitigating correlated biases is more desired than independent and individual debiasing. Lastly, we shed light on the inherent issue of debiasing-accuracy trade-off in bias mitigation. This study serves to motivate future research on joint bias mitigation that accounts for correlated biases.
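
To make the notion of "correlated biases" concrete, here is a minimal illustrative sketch, not the paper's actual method or data: it assumes a hypothetical toxicity classifier whose scores on identity-term templates are given as made-up placeholders, and checks whether the per-template bias gaps for race, gender, and religion co-vary.

```python
# Illustrative sketch only (hypothetical scores, not the paper's data):
# test whether bias "gaps" in a toxicity classifier are correlated across
# identities. In practice the scores would come from a real model run on
# identity-term template sentences.
import numpy as np
from scipy.stats import pearsonr

# Hypothetical predicted toxicity for neutral template sentences and for the
# same templates with race-, gender-, and religion-related terms inserted.
neutral  = np.array([0.10, 0.12, 0.08, 0.20, 0.15])
race     = np.array([0.35, 0.30, 0.22, 0.41, 0.33])
gender   = np.array([0.28, 0.26, 0.18, 0.36, 0.30])
religion = np.array([0.36, 0.33, 0.25, 0.44, 0.37])

# Bias gap per template: how much predicted toxicity rises when an identity
# term is mentioned in an otherwise non-toxic sentence.
gap = {name: group - neutral
       for name, group in [("race", race), ("gender", gender), ("religion", religion)]}

# If biases were independent, the gap for one identity would tell us little
# about the others; strong correlations are consistent with generalized,
# co-occurring bias across groups.
for a, b in [("race", "gender"), ("race", "religion"), ("gender", "religion")]:
    r, p = pearsonr(gap[a], gap[b])
    print(f"{a} vs {b}: Pearson r = {r:.2f} (p = {p:.3f})")
```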
