FlapAI Bird: Training an Agent to Play Flappy Bird Using Reinforcement Learning Techniques

Paper Title

FlapAI Bird: Training an Agent to Play Flappy Bird Using Reinforcement Learning Techniques

Authors

Tai Vu, Leon Tran

Abstract

Reinforcement learning is one of the most popular approaches for automated game playing. This method allows an agent to estimate the expected utility of its state in order to take optimal actions in an unknown environment. We seek to apply reinforcement learning algorithms to the game Flappy Bird. We implement SARSA and Q-Learning with some modifications such as an $ε$-greedy policy, discretization, and backward updates. We find that SARSA and Q-Learning outperform the baseline, regularly achieving scores of 1400+, with a highest in-game score of 2069.
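
The abstract names the techniques but gives no implementation details, so the following is only a minimal tabular sketch of how they might fit together, not the authors' code. The state features (`dx`, `dy`), the grid size, the action encoding, and the values of `alpha`, `gamma`, and `epsilon` are all assumptions made for illustration.

```python
import random
from collections import defaultdict

ACTIONS = (0, 1)  # assumed encoding: 0 = do nothing, 1 = flap


def discretize(dx, dy, grid=10):
    """Bucket continuous pixel offsets into a coarse grid cell (hypothetical state)."""
    return (int(dx // grid), int(dy // grid))


def epsilon_greedy(q, state, epsilon, rng=random):
    """With probability epsilon explore; otherwise pick the highest-valued action."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def q_learning_update(q, s, a, r, s_next, alpha=0.1, gamma=0.95):
    """Off-policy target: bootstrap from the best action in the next state."""
    target = r + gamma * max(q[(s_next, b)] for b in ACTIONS)
    q[(s, a)] += alpha * (target - q[(s, a)])


def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.95):
    """On-policy target: bootstrap from the action actually taken next."""
    target = r + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (target - q[(s, a)])


def backward_update(q, episode, alpha=0.1, gamma=0.95):
    """Replay one episode's (s, a, r, s_next) transitions from last to first."""
    for s, a, r, s_next in reversed(episode):
        q_learning_update(q, s, a, r, s_next, alpha, gamma)


# Usage: a Q-table that defaults to 0.0 for unseen (state, action) pairs.
q = defaultdict(float)
state = discretize(dx=25, dy=-7)
action = epsilon_greedy(q, state, epsilon=0.1)
```

The only difference between the two update rules is the bootstrap term: Q-Learning uses the maximum over next actions, while SARSA uses the action the policy actually chose. Replaying an episode in reverse lets a reward observed at the end of the episode influence earlier states within a single pass, which is one plausible reading of "backward updates".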

Paper Title