Paper Title
OFFO minimization algorithms for second-order optimality and their complexity
Paper Authors
Paper Abstract
An Adagrad-inspired class of algorithms for smooth unconstrained optimization is presented in which the objective function is never evaluated and yet the gradient norms decrease at least as fast as $\calO(1/\sqrt{k+1})$ while second-order optimality measures converge to zero at least as fast as $\calO(1/(k+1)^{1/3})$. This latter rate of convergence is shown to be essentially sharp and is identical to that known for more standard algorithms (like trust-region or adaptive-regularization methods) using both function and derivatives' evaluations. A related "divergent stepsize" method is also described, whose essentially sharp rate of convergence is slightly inferior. It is finally discussed how to obtain weaker second-order optimality guarantees at a (much) reduced computational cost.
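For orientation, the sketch below shows a generic AdaGrad-norm style iteration of the kind the abstract alludes to: the objective function is never evaluated, only gradients are used, and the scalar stepsize decays roughly like $\calO(1/\sqrt{k+1})$. It is an illustration under assumptions rather than the specific algorithm class analysed in the paper; the names grad_f, x0 and nu are hypothetical.

```python
import numpy as np

def adagrad_norm_offo(grad_f, x0, max_iter=1000, nu=1e-8):
    """Generic AdaGrad-norm style OFFO sketch (not the paper's exact method):
    the objective f is never evaluated, only its gradient grad_f."""
    x = np.asarray(x0, dtype=float)
    accum = nu  # running sum of squared gradient norms (small offset avoids division by zero)
    for k in range(max_iter):
        g = grad_f(x)
        accum += np.dot(g, g)
        step = 1.0 / np.sqrt(accum)  # scalar stepsize, decays roughly like 1/sqrt(k+1)
        x = x - step * g             # no function value is used anywhere
    return x

# Usage on a simple quadratic f(x) = 0.5 * ||x||^2, whose gradient is x:
x_final = adagrad_norm_offo(lambda x: x, x0=np.ones(5))
```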