Paper title
Absolute Shapley Value
Paper authors
Paper abstract
The Shapley value, named in honor of Lloyd Shapley, is a concept from cooperative game theory for measuring the contribution of each participant. It has recently been applied in data marketplaces to allocate compensation to data contributors based on their contribution to the trained models. The Shapley value is the only value division scheme for compensation allocation that meets three desirable criteria: group rationality, fairness, and additivity. In cooperative game theory, the marginal contribution of each contributor to each coalition is a nonnegative value. However, in machine learning model training, the marginal contribution of a contributor (a data tuple) to a coalition (a set of data tuples) can be negative; that is, a model trained on a dataset plus an additional data tuple can be less accurate than a model trained on the dataset alone. In this paper, we investigate how to handle negative marginal contributions when computing the Shapley value. We explore three philosophies: 1) taking the original value (Original Shapley Value); 2) taking the larger of the original value and zero (Zero Shapley Value); and 3) taking the absolute value of the original value (Absolute Shapley Value). Experiments on the Iris dataset demonstrate that the Absolute Shapley Value definition significantly outperforms the other two definitions in evaluating data importance (the contribution of each data tuple to the trained model).
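The three philosophies can be illustrated with a minimal sketch. It assumes that the transformation (identity, max with zero, or absolute value) is applied to each marginal contribution before averaging over all orderings; the abstract does not spell out this detail, so treat it as one plausible reading. The function name `shapley_values` and the utility table `TOY_ACCURACY` are hypothetical and stand in for the accuracy of a model trained on each subset of three data tuples.

```python
from itertools import permutations
from math import factorial

# Hypothetical utility table (an assumption for illustration): the "accuracy"
# of a model trained on each coalition of the data tuples a, b, and c.
# Tuple c helps on its own but lowers accuracy when added to any other coalition.
TOY_ACCURACY = {
    frozenset(): 0.0,
    frozenset("a"): 0.6,
    frozenset("b"): 0.5,
    frozenset("c"): 0.2,
    frozenset("ab"): 0.8,
    frozenset("ac"): 0.5,
    frozenset("bc"): 0.4,
    frozenset("abc"): 0.7,
}


def shapley_values(players, utility, transform=lambda x: x):
    """Average the transformed marginal contributions over all orderings.

    With the identity transform this is the exact (original) Shapley value.
    Enumerating every ordering is only feasible for a handful of players.
    """
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            extended = coalition | {p}
            marginal = utility[extended] - utility[coalition]
            totals[p] += transform(marginal)
            coalition = extended
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


players = ["a", "b", "c"]
# 1) Original Shapley Value: keep negative marginal contributions as they are.
original = shapley_values(players, TOY_ACCURACY)
# 2) Zero Shapley Value: replace negative marginal contributions with zero.
zero = shapley_values(players, TOY_ACCURACY, transform=lambda x: max(x, 0.0))
# 3) Absolute Shapley Value: take the magnitude of each marginal contribution.
absolute = shapley_values(players, TOY_ACCURACY, transform=abs)

for name, values in [("original", original), ("zero", zero), ("absolute", absolute)]:
    print(name, {p: round(v, 4) for p, v in values.items()})
```

In this toy table, tuple c changes the model in every coalition it joins, yet its positive and negative marginal contributions cancel under the original definition. The zero and absolute variants assign it progressively larger values, while tuples a and b, whose marginal contributions are all positive, receive the same value under all three definitions.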