论文标题

遗憾的无噪声级联的内核土匪的界限

Regret Bounds for Noise-Free Cascaded Kernelized Bandits

论文作者

Li, Zihan, Scarlett, Jonathan

论文摘要

我们考虑使用RKHS功能类优化无噪声的灰色盒设置中的功能网络,其中可以观察到确切的中间结果。 We assume that the structure of the network is known (but not the underlying functions comprising it), and we study three types of structures: (1) chain: a cascade of scalar-valued functions, (2) multi-output chain: a cascade of vector-valued functions, and (3) feed-forward network: a fully connected feed-forward network of scalar-valued functions.我们提出了一种基于连续置信度结合的算法GPN-UCB,并在累积遗憾的情况下进行了一般的理论上限。此外,我们提出了一种基于非自适应抽样的方法,以及其理论上的上限,这是对Matérn内核的简单遗憾。我们还以简单的遗憾和累积的遗憾提供算法独立的下限。我们对GPN-UCB的遗憾界限对时间范围的依赖性与香草黑框设置中最著名的范围相同,并且对其他参数(例如RKHS Norm和网络长度)的近乎最佳依赖性。

We consider optimizing a function network in the noise-free grey-box setting with RKHS function classes, where the exact intermediate results are observable. We assume that the structure of the network is known (but not the underlying functions comprising it), and we study three types of structures: (1) chain: a cascade of scalar-valued functions, (2) multi-output chain: a cascade of vector-valued functions, and (3) feed-forward network: a fully connected feed-forward network of scalar-valued functions. We propose a sequential upper confidence bound based algorithm GPN-UCB along with a general theoretical upper bound on the cumulative regret. In addition, we propose a non-adaptive sampling based method along with its theoretical upper bound on the simple regret for the Matérn kernel. We also provide algorithm-independent lower bounds on the simple regret and cumulative regret. Our regret bounds for GPN-UCB have the same dependence on the time horizon as the best known in the vanilla black-box setting, as well as near-optimal dependencies on other parameters (e.g., RKHS norm and network length).

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源