Paper title
Networked Restless Multi-Armed Bandits for Mobile Interventions
Paper authors
Paper abstract
Motivated by a broad class of mobile intervention problems, we propose and study restless multi-armed bandits (RMABs) with network effects. In our model, arms are partially recharging and connected through a graph, so that pulling one arm also improves the state of neighboring arms, significantly extending the previously studied setting of fully recharging bandits with no network effects. In mobile interventions, network effects may arise due to regular population movements (such as commuting between home and work). We show that network effects in RMABs induce strong reward coupling that is not accounted for by existing solution methods. We propose a new solution approach for networked RMABs, exploiting concavity properties that arise under natural assumptions on the structure of intervention effects. We provide sufficient conditions for optimality of our approach in idealized settings and demonstrate that it empirically outperforms state-of-the-art baselines in three mobile intervention domains using real-world graphs.