耐腐败的高斯流程强盗优化

论文标题

耐腐败的高斯流程强盗优化

Corruption-Tolerant Gaussian Process Bandit Optimization

论文作者

Bogunovic, Ilija, Krause, Andreas, Scarlett, Jonathan

论文摘要

我们认为，基于嘈杂的匪徒反馈，我们考虑了在某些繁殖的内核希尔伯特空间（RKHS）中优化具有有界规范的未知（通常是非凸）函数的问题。我们考虑了这个问题的一种新颖的变体，在这种变体中，观点评估不仅被随机噪声损坏，而且是对抗性腐败的。我们基于高斯工艺方法引入了一种算法快速慢性GP-UCB，在两个标记为“快速”（但不舒适）和“慢速”（但强大的置信度范围）的实例之间随机选择，以及在不确定性下的优化原理。我们提出了一种新颖的理论分析，从而在腐败水平，时间范围和基础内核方面累积了累积的遗憾，我们认为无法改善某些依赖性。我们观察到，需要独特的算法思想，具体取决于是否需要在损坏和未腐败的环境中表现良好，以及是否已知腐败水平。

We consider the problem of optimizing an unknown (typically non-convex) function with a bounded norm in some Reproducing Kernel Hilbert Space (RKHS), based on noisy bandit feedback. We consider a novel variant of this problem in which the point evaluations are not only corrupted by random noise, but also adversarial corruptions. We introduce an algorithm Fast-Slow GP-UCB based on Gaussian process methods, randomized selection between two instances labeled "fast" (but non-robust) and "slow" (but robust), enlarged confidence bounds, and the principle of optimism under uncertainty. We present a novel theoretical analysis upper bounding the cumulative regret in terms of the corruption level, the time horizon, and the underlying kernel, and we argue that certain dependencies cannot be improved. We observe that distinct algorithmic ideas are required depending on whether one is required to perform well in both the corrupted and non-corrupted settings, and whether the corruption level is known or not.

下载PDF全文

下载文献需遵守相关版权规定

论文标题