Paper Title
Human Response to an AI-Based Decision Support System: A User Study on the Effects of Accuracy and Bias
Paper Authors
Paper Abstract
Artificial Intelligence (AI) is increasingly used to build Decision Support Systems (DSS) across many domains. This paper describes a series of experiments designed to observe human response to different characteristics of a DSS, such as accuracy and bias, particularly the extent to which participants rely on the DSS and the performance they achieve. In our experiments, participants play a simple online game inspired by so-called "wildcat" (i.e., exploratory) drilling for oil. The landscape has two layers: a visible layer describing the costs (terrain), and a hidden layer describing the reward (oil yield). Participants in the control group play the game without receiving any assistance, while in treatment groups they are assisted by a DSS suggesting places to drill. For certain treatments, the DSS does not consider costs, but only rewards, which introduces a bias that is observable by users. Between subjects, we vary the accuracy and bias of the DSS, and observe the participants' total score, time to completion, and the extent to which they follow or ignore suggestions. We also measure the acceptability of the DSS in an exit survey. Our results show that participants tend to score better with the DSS, that the score increase is due to users following the DSS advice, and that it is related to the difficulty of the game and the accuracy of the DSS. We observe that this setting elicits mostly rational behavior from participants, who place a moderate amount of trust in the DSS and show neither algorithmic aversion (under-reliance) nor automation bias (over-reliance). However, their stated willingness to accept the DSS in the exit survey seems less sensitive to the accuracy of the DSS than their behavior, suggesting that users are only partially aware of the (lack of) accuracy of the DSS.
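To make the described setup concrete, here is a minimal sketch of the game mechanics implied by the abstract: a visible cost layer, a hidden reward layer, and a DSS whose suggestion is either unbiased (ranking cells by net value) or biased (ranking by reward only, ignoring costs). The grid size, value ranges, the `accuracy` knob, and the exact objectives are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-D landscape: a visible cost layer (terrain) and a
# hidden reward layer (oil yield), as in the game described above.
GRID = (20, 20)
cost = rng.uniform(1.0, 5.0, size=GRID)      # visible to the player
reward = rng.uniform(0.0, 10.0, size=GRID)   # hidden from the player


def dss_suggestion(reward, cost, accuracy=0.9, biased=False, rng=rng):
    """Suggest a drilling site (assumed mechanics, not the paper's exact DSS).

    With probability `accuracy` the DSS ranks cells by its objective;
    otherwise it returns a uniformly random cell. A biased DSS ranks by
    reward alone; an unbiased one by net value (reward minus cost).
    """
    if rng.random() > accuracy:
        idx = int(rng.integers(0, reward.size))  # inaccurate: random cell
    else:
        objective = reward if biased else reward - cost
        idx = int(np.argmax(objective))
    return np.unravel_index(idx, reward.shape)


def score(site, reward, cost):
    """Net payoff of drilling at `site`: hidden reward minus visible cost."""
    return reward[site] - cost[site]


for biased in (False, True):
    site = dss_suggestion(reward, cost, accuracy=0.9, biased=biased)
    print(f"biased={biased}: suggested {site}, score {score(site, reward, cost):.2f}")
```

Under these assumptions, the biased DSS can steer players toward high-reward but high-cost cells, which is what makes the bias observable to users who can see the terrain.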