Reinforcement Learning Agents in Colonel Blotto

Paper Title

Reinforcement Learning Agents in Colonel Blotto

Paper Author

Noel, Joseph Christian G.

Abstract

Models and games are simplified representations of the world. There are many different kinds of models, differing in complexity and in which aspect of the world they help us understand. In this paper we focus on a specific instance of agent-based models, which uses reinforcement learning (RL) to train the agent how to act in its environment. Reinforcement learning agents are usually also Markov processes, which is another type of model that can be used. We test this reinforcement learning agent in a Colonel Blotto environment and measure its performance against Random agents as its opponents. We find that the RL agent handily beats a single opponent and still performs quite well when the number of opponents is increased. We also analyze the RL agent and examine the strategies it has arrived at by looking at the actions to which it has assigned the highest and lowest Q-values. Interestingly, the optimal strategy for playing against multiple opponents is almost the complete opposite of the optimal strategy for playing against a single opponent.
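
To make the setup in the abstract concrete, the following is a minimal sketch of a tabular learner playing one-shot Colonel Blotto against a uniformly random opponent, then ranking allocations by Q-value. It is not the paper's implementation: the troop count, the number of battlefields, the stateless update rule, the hyperparameters, and the function names `allocations`, `payoff`, and `train` are all assumptions chosen for illustration.

```python
import itertools
import random


def allocations(troops, fields):
    """All ways to split `troops` across `fields` battlefields."""
    return [a for a in itertools.product(range(troops + 1), repeat=fields)
            if sum(a) == troops]


def payoff(mine, theirs):
    """+1 if we win more battlefields than the opponent, -1 if fewer, else 0."""
    won = sum(m > t for m, t in zip(mine, theirs))
    lost = sum(m < t for m, t in zip(mine, theirs))
    return (won > lost) - (won < lost)


def train(troops=6, fields=3, episodes=20000, alpha=0.05, epsilon=0.1, seed=0):
    """Stateless Q-learning (assumed setup): one Q-value per allocation."""
    rng = random.Random(seed)
    actions = allocations(troops, fields)
    q = {a: 0.0 for a in actions}
    for _ in range(episodes):
        # Epsilon-greedy choice over allocations.
        if rng.random() < epsilon:
            action = rng.choice(actions)
        else:
            action = max(q, key=q.get)
        # Assumed Random agent: picks any valid allocation uniformly.
        opponent = rng.choice(actions)
        reward = payoff(action, opponent)
        # Single-step game, so the target is just the observed reward.
        q[action] += alpha * (reward - q[action])
    return q


q_values = train()
ranked = sorted(q_values, key=q_values.get)
print("lowest Q:", ranked[0], "highest Q:", ranked[-1])
```

Inspecting the two ends of `ranked` mirrors the analysis the abstract describes, where the learned strategy is read off from the actions with the highest and lowest Q-values. Extending the sketch to several opponents would require a rule for scoring a battlefield contested by more than two players, which the abstract does not specify.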
