Paper Title
Gender Stereotype Reinforcement: Measuring the Gender Bias Conveyed by Ranking Algorithms
Authors
Abstract
Search Engines (SE) have been shown to perpetuate well-known gender stereotypes identified in the psychology literature and to influence users accordingly. Similar biases have been found encoded in Word Embeddings (WEs) learned from large online corpora. In this context, we propose the Gender Stereotype Reinforcement (GSR) measure, which quantifies the tendency of an SE to support gender stereotypes by leveraging gender-related information encoded in WEs. Through the critical lens of construct validity, we validate the proposed measure on synthetic and real collections. Subsequently, we use GSR to compare widely used Information Retrieval (IR) ranking algorithms, including lexical, semantic, and neural models. We check whether and how ranking algorithms based on WEs inherit the biases of the underlying embeddings. We also consider the most common debiasing approaches for WEs proposed in the literature and test their impact in terms of GSR and common performance measures. To the best of our knowledge, GSR is the first measure specifically tailored to IR that is capable of quantifying representational harms.