论文标题
自适应遗憾保证的新的无投影算法用于在线凸优化
New Projection-free Algorithms for Online Convex Optimization with Adaptive Regret Guarantees
论文作者
论文摘要
我们提出了用于在线凸优化(OCO)的新有效\ textIt {无投影}算法,其中通过无投影,我们指的是避免将正交投影计算到可行集合上的算法,而是在不同的,潜在的效率更高的牙文上转移到可行的集合上。虽然大多数最先进的无投影算法基于\ textit {last-the-the-Leader}框架,但我们的算法在根本上是不同的,并且基于\ textIt {在线渐变下降}算法,采用新颖且有效的方法来计算所谓的textIt \ textit \ textit \ textit \ textit {inspible venterions}。结果,我们获得了第一个自然产生\ textit {自适应后悔}的第一个无投影算法,即保证,即遗憾的是持有W.R.T.的遗憾界限。序列的任何子间隙。具体而言,当假设可行的集合的线性优化甲骨文(LOO)时,按一系列$ t $的顺序,我们的算法保证$ o(t^{3/4})$适应性遗憾和$ o(t^{3/4})$适应性预期的遗憾,仅适用于全部$ loo,仅使用$ $ notity $ notive $ noy $ note $ note $ noce $ noce $均可$ noce $(compliity-noce y loo)。这些界限匹配了当前的最新遗憾范围,用于基于oo的投影的OCO,它是\ textit {不自适应}。我们还考虑了一种新的自然环境,其中可行的集合可以通过分离的甲骨文访问。我们提出算法,使用总体$ O(t)$调用分离甲骨文,请保证$ O(\ sqrt {t})$自适应遗憾和$ O(T^{3/4})$适应性预期的遗憾,分别为全面信息和强盗设置。
We present new efficient \textit{projection-free} algorithms for online convex optimization (OCO), where by projection-free we refer to algorithms that avoid computing orthogonal projections onto the feasible set, and instead relay on different and potentially much more efficient oracles. While most state-of-the-art projection-free algorithms are based on the \textit{follow-the-leader} framework, our algorithms are fundamentally different and are based on the \textit{online gradient descent} algorithm with a novel and efficient approach to computing so-called \textit{infeasible projections}. As a consequence, we obtain the first projection-free algorithms which naturally yield \textit{adaptive regret} guarantees, i.e., regret bounds that hold w.r.t. any sub-interval of the sequence. Concretely, when assuming the availability of a linear optimization oracle (LOO) for the feasible set, on a sequence of length $T$, our algorithms guarantee $O(T^{3/4})$ adaptive regret and $O(T^{3/4})$ adaptive expected regret, for the full-information and bandit settings, respectively, using only $O(T)$ calls to the LOO. These bounds match the current state-of-the-art regret bounds for LOO-based projection-free OCO, which are \textit{not adaptive}. We also consider a new natural setting in which the feasible set is accessible through a separation oracle. We present algorithms which, using overall $O(T)$ calls to the separation oracle, guarantee $O(\sqrt{T})$ adaptive regret and $O(T^{3/4})$ adaptive expected regret for the full-information and bandit settings, respectively.