Paper title

Some convergent results for Backtracking Gradient Descent method on Banach spaces

Author

Truong, Tuyen Trung

Abstract

Our main result concerns the following condition.

{\bf Condition C.} Let $X$ be a Banach space. A $C^1$ function $f:X\rightarrow \mathbb{R}$ satisfies Condition C if, whenever $\{x_n\}$ converges weakly to $x$ and $\lim _{n\rightarrow\infty}||\nabla f(x_n)||=0$, we have $\nabla f(x)=0$. We assume that a canonical isomorphism between $X$ and its dual $X^*$ is given, as is the case, for example, when $X$ is a Hilbert space.

{\bf Theorem.} Let $X$ be a reflexive, complete Banach space and let $f:X\rightarrow \mathbb{R}$ be a $C^2$ function which satisfies Condition C. Moreover, assume that $\sup _{x\in S}||\nabla ^2f(x)||<\infty$ for every bounded set $S\subset X$. Choose a random point $x_0\in X$ and construct, by the Local Backtracking GD procedure (which depends on $3$ hyper-parameters $\alpha,\beta,\delta_0$; see later for details), the sequence $x_{n+1}=x_n-\delta(x_n)\nabla f(x_n)$. Then we have:

1) Every cluster point of $\{x_n\}$, in the {\bf weak} topology, is a critical point of $f$.

2) Either $\lim _{n\rightarrow\infty}f(x_n)=-\infty$ or $\lim _{n\rightarrow\infty}||x_{n+1}-x_n||=0$.

3) Here we work with the weak topology. Let $\mathcal{C}$ be the set of critical points of $f$. Assume that $\mathcal{C}$ has a bounded component $A$. Let $\mathcal{B}$ be the set of cluster points of $\{x_n\}$. If $\mathcal{B}\cap A\not= \emptyset$, then $\mathcal{B}\subset A$ and $\mathcal{B}$ is connected.

4) Assume that $X$ is separable. Then, for generic choices of $\alpha,\beta,\delta_0$ and of the initial point $x_0$, if the sequence $\{x_n\}$ converges in the {\bf weak} topology, then the limit point cannot be a saddle point.
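To make the update rule $x_{n+1}=x_n-\delta(x_n)\nabla f(x_n)$ concrete, here is a minimal sketch of the standard Armijo backtracking line search in $\mathbb{R}^2$. It is not the Local Backtracking GD procedure of the theorem, whose details are not given in this abstract. The reading of $\alpha$ as the Armijo constant, $\beta$ as the shrink factor and $\delta_0$ as the initial step size is an assumption, and the function name `backtracking_gd`, the tolerance `tol`, the iteration cap `max_iter` and the test function are all hypothetical choices made for illustration.

```python
def backtracking_gd(f, grad, x0, alpha=0.5, beta=0.5, delta0=1.0,
                    tol=1e-8, max_iter=10_000):
    """Standard Armijo backtracking GD on R^d (finite-dimensional illustration).

    alpha, beta, delta0 are ASSUMED to play the usual roles: Armijo constant,
    step shrink factor, and initial step size.
    """
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        g_norm_sq = sum(gi * gi for gi in g)
        if g_norm_sq ** 0.5 < tol:
            break  # gradient is numerically zero: approximate critical point
        fx = f(x)
        delta = delta0
        # Shrink delta until Armijo's condition holds:
        #   f(x - delta * g) - f(x) <= -alpha * delta * ||g||^2
        while (f([xi - delta * gi for xi, gi in zip(x, g)]) - fx
               > -alpha * delta * g_norm_sq):
            delta *= beta
        x = [xi - delta * gi for xi, gi in zip(x, g)]
    return x


# Hypothetical test function: a convex quadratic with its minimum at the origin.
def f(x):
    return x[0] ** 2 + 10.0 * x[1] ** 2


def grad_f(x):
    return [2.0 * x[0], 20.0 * x[1]]


x_start = [3.0, -2.0]
x_min = backtracking_gd(f, grad_f, x_start)
print(x_min, f(x_min))
```

On this quadratic the accepted step size always lies between $\delta_0\beta^5$ and $\delta_0\beta$, so no learning rate has to be tuned by hand. That is the practical appeal of backtracking over a fixed step size.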
