Paper title
Online and Customizable Fairness-aware Learning
Paper authors
Paper abstract
While artificial intelligence (AI)-based decision-making systems are increasingly popular, significant concerns have been raised about potential discrimination in the AI decision-making process. For example, the distribution of predictions is often biased and depends on sensitive attributes (e.g., gender and ethnicity). Numerous approaches have therefore been proposed to develop decision-making systems that are discrimination-conscious by design. These approaches are typically batch-based and require all the training data to be available simultaneously for model learning. However, in the real world, data streams usually arrive on the fly, which requires the model to process each input once, "on arrival", without storage or reprocessing. In addition, data streams may evolve over time, which further requires the model to adapt simultaneously to non-stationary data distributions and time-evolving bias patterns, with an effective and robust trade-off between accuracy and fairness. In this paper, we propose a novel framework of fair online decision trees for data streams with possible distribution drift. Specifically, first, we propose two novel fairness splitting criteria that encode the data as well as possible while simultaneously removing dependence on the sensitive attributes, and that further adapt to non-stationary distributions with fine-grained control when needed. Second, we propose two online growth algorithms for fair decision trees that fulfill different online fair decision-making requirements. Our experiments show that our algorithms are able to deal with discrimination in massive and non-stationary streaming environments, with a better trade-off between fairness and predictive performance.