Paper Title

Quantitative Evaluations on Saliency Methods: An Experimental Study

Authors

Xiao-Hui Li, Yuhan Shi, Haoyang Li, Wei Bai, Yuanwei Song, Caleb Chen Cao, Lei Chen

Abstract

It has long been debated that eXplainable AI (XAI) is an important topic, yet it lacks rigorous definitions and fair metrics. In this paper, we briefly summarize the status quo of the metrics, along with an exhaustive experimental study based on them, covering faithfulness, localization, false positives, sensitivity check, and stability. From the experimental results, we conclude that among all the methods we compare, no single explanation method dominates the others across all metrics. Nonetheless, Gradient-weighted Class Activation Mapping (Grad-CAM) and Randomized Input Sampling for Explanation (RISE) perform fairly well on most of the metrics. Utilizing a set of filtered metrics, we further present a case study to diagnose the classification bases of models. While providing a comprehensive experimental study of the metrics, we also examine measuring factors that current metrics miss, and we hope this work can serve as a guide for future research.
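
As a concrete illustration of the faithfulness family of metrics mentioned above, below is a minimal sketch of a deletion-style check: pixels are removed in decreasing order of saliency while the target-class probability is tracked. The names `model`, `image`, `saliency`, and `target` are illustrative placeholders for a PyTorch classifier and its inputs; this is a sketch of the general idea, not the paper's evaluation protocol.

```python
# A minimal sketch of a deletion-style faithfulness metric, assuming a
# PyTorch image classifier. Illustrative only -- not the paper's code.
import torch

def deletion_score(model, image, saliency, target, steps=20):
    """Zero out pixels from most to least salient and average the
    target-class probability; a lower score suggests a more faithful
    saliency map (removing "important" pixels hurts the model more)."""
    model.eval()
    c = image.shape[0]                                   # channels
    order = saliency.flatten().argsort(descending=True)  # most salient first
    per_step = max(order.numel() // steps, 1)
    masked, probs = image.clone(), []
    with torch.no_grad():
        for i in range(steps + 1):
            logits = model(masked.unsqueeze(0))          # add batch dim
            probs.append(torch.softmax(logits, dim=1)[0, target].item())
            # delete the next chunk of most-salient pixels in all channels
            idx = order[i * per_step:(i + 1) * per_step]
            masked.view(c, -1)[:, idx] = 0.0
    # approximate area under the probability-vs-deletion curve
    return sum(probs) / len(probs)
```

The insertion variant works analogously, starting from a blank image and restoring the most salient pixels first, with a higher area under the curve indicating a more faithful explanation.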
