A Game-theoretic Understanding of Repeated Explanations in ML Models

Paper Title

A Game-theoretic Understanding of Repeated Explanations in ML Models

Authors

Kumari, Kavita; Jadliwala, Murtuza; Jha, Sumit Kumar; Maiti, Anindya

Abstract

This paper uses game theory to formally model the strategic repeated interactions between a system, comprising a machine learning (ML) model and an associated explanation method, and an end-user who seeks a prediction/label and its explanation for a query/input. In this game, a malicious end-user must strategically decide when to stop querying and attempt to compromise the system, while the system must strategically decide how much information (in the form of noisy explanations) to share with the end-user and when to stop sharing, all without knowing the end-user's type (honest or malicious). The paper models this trade-off using a continuous-time stochastic signaling game framework and characterizes the Markov perfect equilibrium within that framework.
