Paper Title
Recent Theoretical Advances in Non-Convex Optimization
Paper Authors
Paper Abstract
Motivated by recent increased interest in optimization algorithms for non-convex optimization, in application to training deep neural networks and other optimization problems in data analysis, we give an overview of recent theoretical results on global performance guarantees of optimization algorithms for non-convex optimization. We start with classical arguments showing that, in general, non-convex problems cannot be solved efficiently in reasonable time. Then we give a list of problems that can be solved efficiently to find the global minimizer by exploiting the structure of the problem as much as possible. Another way to deal with non-convexity is to relax the goal from finding the global minimum to finding a stationary point or a local minimum. For this setting, we first present known results on the convergence rates of deterministic first-order methods, followed by a general theoretical analysis of optimal stochastic and randomized gradient schemes and an overview of stochastic first-order methods. After that, we discuss quite general classes of non-convex problems, such as minimization of $α$-weakly-quasi-convex functions and of functions satisfying the Polyak--Lojasiewicz condition, which still allow obtaining theoretical convergence guarantees for first-order methods. Finally, we consider higher-order and zeroth-order/derivative-free methods and their convergence rates for non-convex optimization problems.
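For reference, the two function classes named above admit the following standard definitions, stated here as commonly used in the literature rather than quoted from the paper. A differentiable function $f$ with global minimizer $x^*$ is $α$-weakly-quasi-convex with parameter $α \in (0, 1]$ if
$$α \left( f(x) - f(x^*) \right) \le \langle \nabla f(x),\, x - x^* \rangle \quad \text{for all } x,$$
and $f$ satisfies the Polyak--Lojasiewicz condition with parameter $\mu > 0$ if
$$\frac{1}{2} \|\nabla f(x)\|^2 \ge \mu \left( f(x) - f^* \right) \quad \text{for all } x,$$
where $f^* = \min_x f(x)$. For $L$-smooth functions, the latter condition is known to suffice for gradient descent to converge linearly to the global optimal value even without convexity.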