锐化的准Newton方法：更快的超线性速率和较大的本地收敛邻居

论文标题

锐化的准Newton方法：更快的超线性速率和较大的本地收敛邻居

Sharpened Quasi-Newton Methods: Faster Superlinear Rate and Larger Local Convergence Neighborhood

论文作者

Jin, Qiujiang, Koppel, Alec, Rajawat, Ketan, Mokhtari, Aryan

论文摘要

准Newton方法的非反应分析最近获得了关注。特别是，为（经典）bfgs方法，几项作品已经建立了$ \ mathcal {o}的非反应超线性速率（（1/\ sqrt {t}）^t）$，它通过利用其牛顿方向近似方法零的错误来利用以下事实。此外，最近提出了BFG的贪婪变体，该变体通过直接近似于黑森（Hessian）而不是牛顿方向来加速其收敛性，并达到局部二次收敛速度。 las，与BFG所需的局部超级线性速率相比，贪婪BFG的局部二次收敛需要更多的更新。这是由于以下事实：在贪婪的BFG中，Hessian是直接近似的，牛顿方向近似可能不如BFG的准确性。在本文中，我们缩小了这一差距，并提出了一种新颖的BFGS方法，该方法具有两者中最好的，因为它利用了BFG和贪婪的BFG的近似想法，以正确地近似牛顿方向和同时的Hessian矩阵。我们的理论结果表明，与贪婪-BFG相比，我们的方法在收敛速率方面均超过BFG和贪婪的BFG，而它达到其二次收敛速率。各种数据集上的数值实验也证实了我们的理论发现。

Non-asymptotic analysis of quasi-Newton methods have gained traction recently. In particular, several works have established a non-asymptotic superlinear rate of $\mathcal{O}((1/\sqrt{t})^t)$ for the (classic) BFGS method by exploiting the fact that its error of Newton direction approximation approaches zero. Moreover, a greedy variant of BFGS was recently proposed which accelerates its convergence by directly approximating the Hessian, instead of the Newton direction, and achieves a fast local quadratic convergence rate. Alas, the local quadratic convergence of Greedy-BFGS requires way more updates compared to the number of iterations that BFGS requires for a local superlinear rate. This is due to the fact that in Greedy-BFGS the Hessian is directly approximated and the Newton direction approximation may not be as accurate as the one for BFGS. In this paper, we close this gap and present a novel BFGS method that has the best of both worlds in that it leverages the approximation ideas of both BFGS and Greedy-BFGS to properly approximate the Newton direction and the Hessian matrix simultaneously. Our theoretical results show that our method out-performs both BFGS and Greedy-BFGS in terms of convergence rate, while it reaches its quadratic convergence rate with fewer steps compared to Greedy-BFGS. Numerical experiments on various datasets also confirm our theoretical findings.

下载PDF全文

下载文献需遵守相关版权规定

论文标题