Paper Title

Efficient Gender Debiasing of Pre-trained Indic Language Models

Authors

Neeraja Kirtane, Manushree V, Aditya Kane

Abstract

The gender bias present in the data on which language models are pre-trained is reflected in the systems that use these models. A model's intrinsic gender bias conveys an outdated and unequal view of women in our culture and encourages discrimination. Therefore, to build more equitable systems and increase fairness, it is crucial to identify and mitigate the bias existing in these models. While there is a significant amount of work in this area for English, there is a dearth of research on other gendered and low-resource languages, particularly the Indian languages. English is a non-gendered language with genderless nouns. Methodologies for bias detection in English cannot be directly deployed to other, gendered languages, whose syntax and semantics differ. In our paper, we measure the gender bias associated with occupations in Hindi language models. Our major contributions are the construction of a novel corpus to evaluate occupational gender bias in Hindi, the quantification of the existing bias in these systems using a well-defined metric, and its mitigation by efficiently fine-tuning our model. Our results show that the bias is reduced after the introduction of our proposed mitigation techniques. Our codebase is publicly available.
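The quantification step the abstract describes can be illustrated with a small sketch. The paper's exact metric is not given on this page, so the following assumes a common log-probability-ratio formulation: for each occupation template, compare the probabilities a masked language model assigns to male- versus female-gendered completions. The occupation names and probability values below are hypothetical placeholders, not the paper's data.

```python
import math

def bias_score(p_male: float, p_female: float) -> float:
    """Log-ratio bias score for one occupation template.

    Positive values mean the model prefers the male-gendered
    completion, negative values the female-gendered one, and
    zero means no preference. (Assumed metric, for illustration.)
    """
    return math.log(p_male / p_female)

# Hypothetical fill probabilities a masked Hindi LM might assign to
# male vs. female gendered words in an occupation template.
occupations = {
    "doctor": (0.62, 0.21),
    "nurse": (0.08, 0.55),
}

scores = {occ: bias_score(pm, pf) for occ, (pm, pf) in occupations.items()}

# One possible corpus-level summary: mean absolute per-occupation bias,
# so opposite-direction stereotypes do not cancel out.
corpus_bias = sum(abs(s) for s in scores.values()) / len(scores)
```

In practice the probabilities would come from a pre-trained Hindi model's mask-filling output, and a drop in the corpus-level score after fine-tuning would indicate successful mitigation.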
