Paper Title


Revisiting the acceleration phenomenon via high-resolution differential equations

Paper Authors

Shuo Chen, Bin Shi, Ya-xiang Yuan

Abstract


Nesterov's accelerated gradient descent (NAG) is one of the milestones in the history of first-order algorithms. The mechanism behind the acceleration phenomenon, namely the gradient correction term, was not uncovered until the high-resolution differential equation framework was proposed in [Shi et al., 2022]. To deepen our understanding of how the high-resolution differential equation framework characterizes the convergence rate, we continue in this paper to investigate NAG for $\mu$-strongly convex functions using the techniques of Lyapunov analysis and phase-space representation. First, we revisit the proof from the gradient-correction scheme. Similar to [Chen et al., 2022], a straightforward calculation greatly simplifies the proof and enlarges the step size to $s = 1/L$ with only minor modification; moreover, the way the Lyapunov function is constructed is principled. Furthermore, we also investigate NAG from the implicit-velocity scheme. Owing to the difference in the velocity iterates, we find that the Lyapunov function for the implicit-velocity scheme can be constructed without the additional term, and the calculation of the iterative difference becomes simpler. Together with the optimal step size obtained, the high-resolution differential equation framework from the implicit-velocity scheme of NAG is complete and outperforms that of the gradient-correction scheme.
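For readers unfamiliar with the algorithm the abstract analyzes, the following is a minimal sketch of NAG for a $\mu$-strongly convex, $L$-smooth objective with step size $s = 1/L$, as discussed above. The quadratic test problem, the function name, and the specific initialization are illustrative assumptions, not taken from the paper; the momentum coefficient $(1-\sqrt{\mu s})/(1+\sqrt{\mu s})$ is the standard choice in the strongly convex setting.

```python
import numpy as np

def nag_strongly_convex(grad, x0, mu, L, iters=200):
    """Sketch of NAG for a mu-strongly convex, L-smooth function.

    Uses step size s = 1/L and the standard momentum coefficient
    (1 - sqrt(mu*s)) / (1 + sqrt(mu*s)) for the strongly convex case.
    """
    s = 1.0 / L
    beta = (1 - np.sqrt(mu * s)) / (1 + np.sqrt(mu * s))
    x_prev = x0.copy()
    x = x0 - s * grad(x0)                # initial plain gradient step
    for _ in range(iters):
        y = x + beta * (x - x_prev)      # extrapolation (momentum) step
        x_prev, x = x, y - s * grad(y)   # gradient step at the extrapolated point
    return x

# Illustrative quadratic f(x) = 0.5 * x^T A x with eigenvalues mu = 0.1, L = 1,
# so grad f(x) = A x and the minimizer is the origin.
A = np.diag([0.1, 1.0])
x_star = nag_strongly_convex(lambda x: A @ x, np.array([5.0, -3.0]), mu=0.1, L=1.0)
```

With these parameters the iterates contract linearly at a rate governed by $\sqrt{\mu/L}$, so after a few hundred iterations `x_star` is numerically at the minimizer.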
