Exploring the Linear Subspace Hypothesis in Gender Bias Mitigation

Paper title

Exploring the Linear Subspace Hypothesis in Gender Bias Mitigation

Authors

Francisco Vargas, Ryan Cotterell

Abstract

Bolukbasi et al. (2016) present one of the first gender bias mitigation techniques for word representations. Their method takes pre-trained word representations as input and attempts to isolate a linear subspace that captures most of the gender bias in the representations. As judged by an analogical evaluation task, their method virtually eliminates gender bias in the representations. However, an implicit and untested assumption of their method is that the bias subspace is actually linear. In this work, we generalize their method to a kernelized, nonlinear version. We take inspiration from kernel principal component analysis and derive a nonlinear bias isolation technique. We discuss and overcome some of the practical drawbacks of our method for nonlinear gender bias mitigation in word representations and analyze empirically whether the bias subspace is actually linear. Our analysis shows that gender bias is in fact well captured by a linear subspace, justifying the assumption of Bolukbasi et al. (2016).
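
To make the linear subspace idea in the abstract concrete, here is a minimal NumPy sketch of the linear case only: it estimates a bias direction from a few definitional word pairs by principal component analysis and then removes that component from another vector. It is an illustration under stated assumptions, not the authors' code and not the kernelized method the paper derives. The function names `bias_subspace` and `neutralize`, the 4-dimensional toy vectors, and the choice of `k=1` are all hypothetical.

```python
import numpy as np


def bias_subspace(defining_sets, k=1):
    """Return a (k, d) orthonormal basis for the top-k principal directions
    of the defining sets, after centering each set at its own mean."""
    centered = []
    for vectors in defining_sets:
        vectors = np.asarray(vectors, dtype=float)
        # Center each set at its mean and scale by 1/sqrt(set size), so that
        # stacked.T @ stacked is a sum of per-set covariance-like matrices.
        centered.append((vectors - vectors.mean(axis=0)) / np.sqrt(len(vectors)))
    stacked = np.vstack(centered)
    # The right singular vectors of `stacked` are the eigenvectors of
    # stacked.T @ stacked, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(stacked, full_matrices=False)
    return vt[:k]


def neutralize(vector, basis):
    """Remove the component of `vector` that lies in the span of `basis`."""
    vector = np.asarray(vector, dtype=float)
    return vector - basis.T @ (basis @ vector)


# Made-up toy embeddings (assumption): the first coordinate plays the role
# of the gender direction, the remaining coordinates carry other content.
he, she = [1.0, 0.2, 0.1, 0.0], [-1.0, 0.2, 0.1, 0.0]
man, woman = [0.9, -0.3, 0.5, 0.1], [-0.9, -0.3, 0.5, 0.1]

basis = bias_subspace([[he, she], [man, woman]], k=1)

# A hypothetical word vector with some component along the bias direction.
word = np.array([0.6, 0.4, -0.2, 0.3])
debiased = neutralize(word, basis)
```

In this toy setup the estimated direction coincides with the first coordinate axis, so `debiased` keeps the last three coordinates of `word` and has no remaining component along `basis`. A kernelized variant would carry out the analogous steps in a feature space induced by a kernel, which this sketch does not attempt.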
