Paper title
Deep reinforcement learning for large-scale epidemic control
Paper authors
Paper abstract
Epidemics of infectious diseases are an important threat to public health and global economies. Yet the development of prevention strategies remains challenging, as epidemics are non-linear and complex processes. For this reason, we investigate a deep reinforcement learning approach to automatically learn prevention strategies in the context of pandemic influenza. First, we construct a new epidemiological meta-population model, with 379 patches (one for each administrative district in Great Britain), that adequately captures the infection process of pandemic influenza. Our model balances complexity and computational efficiency such that the use of reinforcement learning techniques becomes attainable. Second, we set up a ground truth such that we can evaluate the performance of the 'Proximal Policy Optimization' algorithm when learning in a single district of this epidemiological model. Finally, we consider a large-scale problem by conducting an experiment in which we aim to learn a joint policy to control the districts in a community of 11 tightly coupled districts, for which no ground truth can be established. This experiment shows that deep reinforcement learning can be used to learn mitigation policies in complex epidemiological models with a large state space. Moreover, through this experiment, we demonstrate that there can be an advantage in considering collaboration between districts when designing prevention strategies.
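To make the notion of a meta-population model with coupled patches more concrete, the following is a minimal illustrative sketch and not the authors' model. It assumes plain SIR compartments per patch, a forward-Euler time step, and a row-wise coupling matrix that mixes infectious prevalence between patches. The names `Patch`, `step`, `simulate`, `coupling`, `beta`, and `gamma` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class Patch:
    """One district (patch) with assumed SIR compartments."""
    s: float  # susceptible individuals
    i: float  # infectious individuals
    r: float  # recovered individuals

    @property
    def n(self) -> float:
        """Total population of the patch (assumed to be positive)."""
        return self.s + self.i + self.r


def step(patches, coupling, beta, gamma, dt=1.0):
    """Advance every patch by one forward-Euler step of a coupled SIR model.

    coupling[a][b] is an assumed weight for how strongly infectious
    prevalence in patch b contributes to infections in patch a.
    """
    k = len(patches)
    updated = []
    for a in range(k):
        # Force of infection: weighted infectious prevalence over all patches.
        force = beta * sum(
            coupling[a][b] * patches[b].i / patches[b].n for b in range(k)
        )
        p = patches[a]
        # Clamp flows so that no compartment becomes negative.
        infections = min(p.s, force * p.s * dt)
        recoveries = min(p.i, gamma * p.i * dt)
        updated.append(
            Patch(p.s - infections, p.i + infections - recoveries, p.r + recoveries)
        )
    return updated


def simulate(patches, coupling, beta, gamma, steps, dt=1.0):
    """Run the coupled model for a fixed number of steps and return the final state."""
    for _ in range(steps):
        patches = step(patches, coupling, beta, gamma, dt)
    return patches
```

In this sketch, an outbreak seeded in one patch can reach a second patch only through the off-diagonal coupling weights. This is one simple way to picture why tightly coupled districts might benefit from a joint policy rather than independent ones.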