随机线性匪徒对对抗性攻击

论文标题

随机线性匪徒对对抗性攻击

Stochastic Linear Bandits Robust to Adversarial Attacks

论文作者

Bogunovic, Ilija, Losalka, Arpan, Krause, Andreas, Scarlett, Jonathan

论文摘要

我们考虑了一个随机线性匪徒问题，其中奖励不仅会受到随机噪声的影响，而且还受到适当的预算$ c $（即，在整个时间范围内的腐败大幅面）的对抗性攻击（即，上限）。我们提供了两个强大的分阶段消除算法的变体，一种知道$ c $，另一种不知道。在未腐败的情况$ C = 0 $中，这两种变体均表现出近乎最佳的遗憾，而总体上对$ C $具有线性和二次依赖性。我们提出算法独立的下限，表明这些加性项几乎是最佳的。此外，在上下文环境中，我们重新审视了各种不同上下文的设置，并表明一种简单的贪婪算法证明，尽管不进行明确的探索，并且不知道$ c $，却在近乎最理想的添加性遗憾术语中表现出色。

We consider a stochastic linear bandit problem in which the rewards are not only subject to random noise, but also adversarial attacks subject to a suitable budget $C$ (i.e., an upper bound on the sum of corruption magnitudes across the time horizon). We provide two variants of a Robust Phased Elimination algorithm, one that knows $C$ and one that does not. Both variants are shown to attain near-optimal regret in the non-corrupted case $C = 0$, while incurring additional additive terms respectively having a linear and quadratic dependency on $C$ in general. We present algorithm independent lower bounds showing that these additive terms are near-optimal. In addition, in a contextual setting, we revisit a setup of diverse contexts, and show that a simple greedy algorithm is provably robust with a near-optimal additive regret term, despite performing no explicit exploration and not knowing $C$.

下载PDF全文

下载文献需遵守相关版权规定

论文标题