Speciesist Language and Nonhuman Animal Bias in English Masked Language Models

Paper Title

Speciesist Language and Nonhuman Animal Bias in English Masked Language Models

Authors

Masashi Takeshita, Rafal Rzepka, Kenji Araki

Abstract

Various existing studies have analyzed what social biases are inherited by NLP models. These biases may directly or indirectly harm people, therefore previous studies have focused only on human attributes. However, until recently no research on social biases in NLP regarding nonhumans existed. In this paper, we analyze biases to nonhuman animals, i.e. speciesist bias, inherent in English Masked Language Models such as BERT. We analyzed speciesist bias against 46 animal names using template-based and corpus-extracted sentences containing speciesist (or non-speciesist) language. We found that pre-trained masked language models tend to associate harmful words with nonhuman animals and have a bias toward using speciesist language for some nonhuman animal names. Our code for reproducing the experiments will be made available on GitHub.
