论文标题
快速,简单地修改牛顿的方法,有助于避免鞍点
A fast and simple modification of Newton's method helping to avoid saddle points
论文作者
论文摘要
我们在本文中提出了新的Q-Newton的方法。更新规则在概念上非常简单,例如$ x_ {n+1} = x_n-w_n $其中$ w_n = pr_n = pr_ {a_n,+}(v_n)(v_n)-pr_ {a_n, - }(v_n)(v_n)$,带有$ a_n = \ nabla ^nabla ^2f(x_n)+nabl(x_n)+unabl+nabl | nabl | nabl | nabl | ^2. $ v_n = a_n^{ - 1}。\ nabla f(x_n)$。这里$δ_n$是一个适当的实数,因此$ a_n $是可逆的,$ pr_ {a_n,\ pm} $是由$ a_n $的正(相应负)特征值(相应负)特征值(相应负)特征值生成的向量子空间的投影。 本文的主要结果粗略地说,如果$ f $是$ c^3 $(可以在下面不受限制)和一个序列$ \ {x_n \} $,是由新的Q-Newton的方法从随机的初始点$ x_0 $,{\ bf}中构建的,那么限制点是一个关键点,而限制点也不是一个相同的方法。第一作者最近成功地将回溯线搜索纳入了新的Q-Newton的方法,从而解决了某些(非平滑)成本功能观察到的收敛保证问题。将讨论快速查找单变量杂形函数的零的应用程序。对众所周知的算法(例如BFG)和适应性立方体化进行了各种实验。
We propose in this paper New Q-Newton's method. The update rule is very simple conceptually, for example $x_{n+1}=x_n-w_n$ where $w_n=pr_{A_n,+}(v_n)-pr_{A_n,-}(v_n)$, with $A_n=\nabla ^2f(x_n)+δ_n||\nabla f(x_n)||^2.Id$ and $v_n=A_n^{-1}.\nabla f(x_n)$. Here $δ_n$ is an appropriate real number so that $A_n$ is invertible, and $pr_{A_n,\pm}$ are projections to the vector subspaces generated by eigenvectors of positive (correspondingly negative) eigenvalues of $A_n$. The main result of this paper roughly says that if $f$ is $C^3$ (can be unbounded from below) and a sequence $\{x_n\}$, constructed by the New Q-Newton's method from a random initial point $x_0$, {\bf converges}, then the limit point is a critical point and is not a saddle point, and the convergence rate is the same as that of Newton's method. The first author has recently been successful incorporating Backtracking line search to New Q-Newton's method, thus resolving the convergence guarantee issue observed for some (non-smooth) cost functions. An application to quickly finding zeros of a univariate meromorphic function will be discussed. Various experiments are performed, against well known algorithms such as BFGS and Adaptive Cubic Regularization are presented.