Paper Title


Revisiting the acceleration phenomenon via high-resolution differential equations

Paper Authors

Shuo Chen, Bin Shi, Ya-xiang Yuan

Abstract


Nesterov's accelerated gradient descent (NAG) is one of the milestones in the history of first-order algorithms. The mechanism behind the acceleration phenomenon, namely the gradient correction term, was not uncovered until the high-resolution differential equation framework was proposed in [Shi et al., 2022]. To deepen our understanding of how the high-resolution differential equation framework characterizes the convergence rate, we continue in this paper to investigate NAG for $\mu$-strongly convex functions using the techniques of Lyapunov analysis and phase-space representation. First, we revisit the proof from the gradient-correction scheme. Similar to [Chen et al., 2022], a straightforward calculation greatly simplifies the proof and enlarges the step size to $s = 1/L$ with only minor modification; moreover, the way the Lyapunov function is constructed is principled. Furthermore, we also investigate NAG from the implicit-velocity scheme. Owing to the difference in the velocity iterates, we find that the Lyapunov function for the implicit-velocity scheme can be constructed without the additional term, and the calculation of the iterative difference becomes simpler. Together with the optimal step size obtained, the high-resolution differential equation framework from the implicit-velocity scheme of NAG is complete and outperforms that of the gradient-correction scheme.
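For readers unfamiliar with the algorithm the abstract analyzes, the following is a minimal sketch of NAG for a $\mu$-strongly convex, $L$-smooth objective with step size $s = 1/L$, as discussed above. The quadratic test problem, the function name, and the specific initialization are illustrative assumptions, not taken from the paper; the momentum coefficient $(1-\sqrt{\mu s})/(1+\sqrt{\mu s})$ is the standard choice in the strongly convex setting.

```python
import numpy as np

def nag_strongly_convex(grad, x0, mu, L, iters=200):
    """Sketch of NAG for a mu-strongly convex, L-smooth function.

    Uses step size s = 1/L and the standard momentum coefficient
    (1 - sqrt(mu*s)) / (1 + sqrt(mu*s)) for the strongly convex case.
    """
    s = 1.0 / L
    beta = (1 - np.sqrt(mu * s)) / (1 + np.sqrt(mu * s))
    x_prev = x0.copy()
    x = x0 - s * grad(x0)                # initial plain gradient step
    for _ in range(iters):
        y = x + beta * (x - x_prev)      # extrapolation (momentum) step
        x_prev, x = x, y - s * grad(y)   # gradient step at the extrapolated point
    return x

# Illustrative quadratic f(x) = 0.5 * x^T A x with eigenvalues mu = 0.1, L = 1,
# so grad f(x) = A x and the minimizer is the origin.
A = np.diag([0.1, 1.0])
x_star = nag_strongly_convex(lambda x: A @ x, np.array([5.0, -3.0]), mu=0.1, L=1.0)
```

With these parameters the iterates contract linearly at a rate governed by $\sqrt{\mu/L}$, so after a few hundred iterations `x_star` is numerically at the minimizer.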
