Paper Title

SUBPLEX: Towards a Better Understanding of Black Box Model Explanations at the Subpopulation Level

Paper Authors

Jun Yuan, Gromit Yeuk-Yin Chan, Brian Barr, Kyle Overton, Kim Rees, Luis Gustavo Nonato, Enrico Bertini, Claudio T. Silva

Paper Abstract

Understanding the interpretation of machine learning (ML) models has been of paramount importance when making decisions with societal impacts such as transport control, financial activities, and medical diagnosis. While current model interpretation methodologies focus on using locally linear functions to approximate the models or creating self-explanatory models that give explanations to each input instance, they do not focus on model interpretation at the subpopulation level, which is the understanding of model interpretations across different subset aggregations in a dataset. To address the challenges of providing explanations of an ML model across the whole dataset, we propose SUBPLEX, a visual analytics system to help users understand black-box model explanations with subpopulation visual analysis. SUBPLEX is designed through an iterative design process with machine learning researchers to address three usage scenarios of real-life machine learning tasks: model debugging, feature selection, and bias detection. The system applies novel subpopulation analysis on ML model explanations and interactive visualization to explore the explanations on a dataset with different levels of granularity. Based on the system, we conduct a user evaluation to assess how understanding the interpretation at a subpopulation level influences the sense-making process of interpreting ML models from a user's perspective. Our results suggest that by providing model explanations for different groups of data, SUBPLEX encourages users to generate more ingenious ideas to enrich the interpretations. It also helps users to acquire a tight integration between programming workflow and visual analytics workflow. Last but not least, we summarize the considerations observed in applying visualization to machine learning interpretations.
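To make the abstract's notion of subpopulation-level explanations concrete, the sketch below shows the general idea of clustering per-instance explanation vectors and aggregating them per cluster. This is an illustrative example, not the SUBPLEX implementation: the dataset, the use of a linear model's coefficient-times-value attribution as a stand-in for a local explainer such as LIME or SHAP, and the choice of four clusters are all assumptions made here for demonstration; only scikit-learn and NumPy are required.

```python
# Minimal sketch (not the SUBPLEX system): compute per-instance feature
# attributions, cluster the attribution vectors so each cluster acts as a
# "subpopulation", and report each subpopulation's aggregated explanation.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

# Fit a simple model on a toy dataset (illustrative choice).
data = load_breast_cancer()
X = StandardScaler().fit_transform(data.data)
model = LogisticRegression(max_iter=5000).fit(X, data.target)

# Per-instance attribution for a linear model: coefficient * feature value.
# (Stand-in for any local explainer such as LIME or SHAP.)
attributions = X * model.coef_[0]          # shape: (n_samples, n_features)

# Cluster the explanation vectors to form subpopulations (k chosen arbitrarily).
k = 4
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(attributions)

# Aggregate: mean attribution per feature within each subpopulation,
# then print the three most influential features per group.
for c in range(k):
    mean_attr = attributions[labels == c].mean(axis=0)
    top = np.argsort(-np.abs(mean_attr))[:3]
    summary = ", ".join(f"{data.feature_names[i]}={mean_attr[i]:+.2f}" for i in top)
    print(f"subpopulation {c} (n={np.sum(labels == c)}): {summary}")
```

In this sketch the cluster-mean attribution vector plays the role of the subpopulation-level explanation that SUBPLEX lets users explore interactively at different levels of granularity.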
