论文标题
锐化的准Newton方法:更快的超线性速率和较大的本地收敛邻居
Sharpened Quasi-Newton Methods: Faster Superlinear Rate and Larger Local Convergence Neighborhood
论文作者
论文摘要
准Newton方法的非反应分析最近获得了关注。特别是,为(经典)bfgs方法,几项作品已经建立了$ \ mathcal {o}的非反应超线性速率((1/\ sqrt {t})^t)$,它通过利用其牛顿方向近似方法零的错误来利用以下事实。此外,最近提出了BFG的贪婪变体,该变体通过直接近似于黑森(Hessian)而不是牛顿方向来加速其收敛性,并达到局部二次收敛速度。 las,与BFG所需的局部超级线性速率相比,贪婪BFG的局部二次收敛需要更多的更新。这是由于以下事实:在贪婪的BFG中,Hessian是直接近似的,牛顿方向近似可能不如BFG的准确性。在本文中,我们缩小了这一差距,并提出了一种新颖的BFGS方法,该方法具有两者中最好的,因为它利用了BFG和贪婪的BFG的近似想法,以正确地近似牛顿方向和同时的Hessian矩阵。我们的理论结果表明,与贪婪-BFG相比,我们的方法在收敛速率方面均超过BFG和贪婪的BFG,而它达到其二次收敛速率。各种数据集上的数值实验也证实了我们的理论发现。
Non-asymptotic analysis of quasi-Newton methods have gained traction recently. In particular, several works have established a non-asymptotic superlinear rate of $\mathcal{O}((1/\sqrt{t})^t)$ for the (classic) BFGS method by exploiting the fact that its error of Newton direction approximation approaches zero. Moreover, a greedy variant of BFGS was recently proposed which accelerates its convergence by directly approximating the Hessian, instead of the Newton direction, and achieves a fast local quadratic convergence rate. Alas, the local quadratic convergence of Greedy-BFGS requires way more updates compared to the number of iterations that BFGS requires for a local superlinear rate. This is due to the fact that in Greedy-BFGS the Hessian is directly approximated and the Newton direction approximation may not be as accurate as the one for BFGS. In this paper, we close this gap and present a novel BFGS method that has the best of both worlds in that it leverages the approximation ideas of both BFGS and Greedy-BFGS to properly approximate the Newton direction and the Hessian matrix simultaneously. Our theoretical results show that our method out-performs both BFGS and Greedy-BFGS in terms of convergence rate, while it reaches its quadratic convergence rate with fewer steps compared to Greedy-BFGS. Numerical experiments on various datasets also confirm our theoretical findings.