Paper title
SP2: A Second Order Stochastic Polyak Method
Paper authors
Paper abstract
Recently, the "SP" (Stochastic Polyak step size) method has emerged as a competitive adaptive method for setting the step sizes of SGD. SP can be interpreted as a method specialized to interpolated models, since it solves the interpolation equations. SP solves these equations by using local linearizations of the model. We take a step further and develop a method for solving the interpolation equations that uses the local second-order approximation of the model. Our resulting method, SP2, uses Hessian-vector products to speed up the convergence of SP. Furthermore, and rather uniquely among second-order methods, the design of SP2 in no way relies on positive definite Hessian matrices or convexity of the objective function. We show that SP2 is very competitive on matrix completion, non-convex test problems, and logistic regression. We also provide a convergence theory on sums-of-quadratics.
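To make the "local linearization" idea concrete, here is a minimal sketch of the first-order SP step that the abstract takes as its starting point. It assumes the interpolation setting in which every sampled loss can reach zero, so the step projects the current iterate onto the zero set of the linearized loss. The names `sp_step` and `run_sp`, the least-squares losses, and the random test problem are illustrative assumptions, not taken from the paper, and the second-order SP2 update itself is not reproduced here.

```python
import numpy as np


def sp_step(w, loss, grad):
    """One SP step, assuming the optimal value of the sampled loss is 0.

    Solves the linearized interpolation equation
        loss + grad @ (w_new - w) = 0
    with the smallest possible move from w.
    """
    g_norm_sq = float(grad @ grad)
    if g_norm_sq == 0.0:  # zero gradient: the linearization gives no direction
        return w
    return w - (loss / g_norm_sq) * grad


def run_sp(A, b, n_steps=2000, seed=0):
    """Run SP on the losses f_i(w) = 0.5 * (A[i] @ w - b[i])**2 (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(A.shape[1])
    for _ in range(n_steps):
        i = rng.integers(A.shape[0])   # sample one data point uniformly
        r = A[i] @ w - b[i]            # residual of the sampled equation
        w = sp_step(w, 0.5 * r**2, r * A[i])
    return w


# A consistent linear system, so that interpolation holds by construction.
rng = np.random.default_rng(1)
A = rng.standard_normal((20, 5))
w_star = rng.standard_normal(5)
b = A @ w_star

w = run_sp(A, b)
print(np.linalg.norm(w - w_star))
```

In this sketch the step size is set entirely by the sampled loss and gradient, with no learning rate to tune. The method described in the abstract replaces the linear model inside `sp_step` with a local quadratic model, which is where Hessian-vector products come in.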