论文标题
在国际象棋中学习个人行为的模型
Learning Models of Individual Behavior in Chess
论文作者
论文摘要
在人类可能希望从这些系统中学习,与它们合作或作为合作伙伴互动的情况下,可以捕获类似人类行为的AI系统越来越有用。为了开发以人为导向的AI系统,预测人类行为而不是预测最佳行动的问题已受到了广泛关注。现有的工作集中在总体意义上捕获人类行为,这可能会限制任何特定个人可以从与这些系统互动中获得的收益。我们通过开发国际象棋中人类行为的高度准确的预测模型来扩展这一工作。国际象棋是探索人类互动的丰富领域,因为它结合了一套独特的属性:多年前的AI系统达到了超人的表现,但是人类仍然与他们以及对手和准备工具紧密互动,并且在个人玩家游戏中有大量的记录数据。从迈亚(Maia)开始,该版本的Alphazero经过了对人类人群的培训,我们证明我们可以通过应用一系列微调方法来显着提高特定玩家的举动的预测准确性。此外,我们的个性化模型可用于执行风格测定法 - 预测谁做出了一套给定的动作 - 表明他们在个人层面上捕获了人类的决策。我们的工作展示了一种使AI系统更好地与个人行为保持一致的方法,这可能会导致人类互动的大量改善。
AI systems that can capture human-like behavior are becoming increasingly useful in situations where humans may want to learn from these systems, collaborate with them, or engage with them as partners for an extended duration. In order to develop human-oriented AI systems, the problem of predicting human actions -- as opposed to predicting optimal actions -- has received considerable attention. Existing work has focused on capturing human behavior in an aggregate sense, which potentially limits the benefit any particular individual could gain from interaction with these systems. We extend this line of work by developing highly accurate predictive models of individual human behavior in chess. Chess is a rich domain for exploring human-AI interaction because it combines a unique set of properties: AI systems achieved superhuman performance many years ago, and yet humans still interact with them closely, both as opponents and as preparation tools, and there is an enormous corpus of recorded data on individual player games. Starting with Maia, an open-source version of AlphaZero trained on a population of human players, we demonstrate that we can significantly improve prediction accuracy of a particular player's moves by applying a series of fine-tuning methods. Furthermore, our personalized models can be used to perform stylometry -- predicting who made a given set of moves -- indicating that they capture human decision-making at an individual level. Our work demonstrates a way to bring AI systems into better alignment with the behavior of individual people, which could lead to large improvements in human-AI interaction.