Paper Title
Bilevel Optimization: Convergence Analysis and Enhanced Design
Paper Authors
Paper Abstract
Bilevel optimization has arisen as a powerful tool for many machine learning problems such as meta-learning, hyperparameter optimization, and reinforcement learning. In this paper, we investigate the nonconvex-strongly-convex bilevel optimization problem. For deterministic bilevel optimization, we provide a comprehensive convergence rate analysis for two popular algorithms based respectively on approximate implicit differentiation (AID) and iterative differentiation (ITD). For the AID-based method, we improve the previous convergence rate analysis order-wise, owing to a more practical parameter selection and a warm-start strategy; for the ITD-based method, we establish the first theoretical convergence rate. Our analysis also provides a quantitative comparison between the ITD- and AID-based approaches. For stochastic bilevel optimization, we propose a novel algorithm named stocBiO, which features a sample-efficient hypergradient estimator built on efficient Jacobian- and Hessian-vector product computations. We provide a convergence rate guarantee for stocBiO and show that it improves order-wise upon the best known computational complexities with respect to the condition number $\kappa$ and the target accuracy $\epsilon$. We further validate our theoretical results and demonstrate the efficiency of bilevel optimization algorithms through experiments on meta-learning and hyperparameter optimization.
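For readers unfamiliar with the setup, the nonconvex-strongly-convex bilevel problem studied here, and the implicit-differentiation form of the hypergradient that both AID and ITD approximate, can be written as follows (a standard formulation; the symbols $f$, $g$, and $\Phi$ are used here for illustration):

$$
\min_{x \in \mathbb{R}^p} \ \Phi(x) := f\bigl(x, y^*(x)\bigr)
\qquad \text{s.t.} \qquad y^*(x) = \arg\min_{y \in \mathbb{R}^q} g(x, y),
$$

where $g(x, \cdot)$ is strongly convex and $\Phi$ is possibly nonconvex. By the implicit function theorem,

$$
\nabla \Phi(x) = \nabla_x f\bigl(x, y^*(x)\bigr) - \nabla_x \nabla_y g\bigl(x, y^*(x)\bigr)\,\bigl[\nabla_y^2 g\bigl(x, y^*(x)\bigr)\bigr]^{-1} \nabla_y f\bigl(x, y^*(x)\bigr),
$$

so the expensive term is the Hessian-inverse-vector product: AID approximates it by solving the linear system directly (e.g., via conjugate gradient or a Neumann series), while ITD differentiates through the inner-loop iterations.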
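As a concrete illustration, the sketch below estimates this hypergradient using only Jacobian- and Hessian-vector products, with a truncated Neumann series for the Hessian-inverse-vector product. This is a minimal deterministic sketch in the spirit of the stocBiO estimator, not the authors' implementation; stocBiO additionally uses minibatch sampling, which is omitted here, and the objectives and all names below are toy placeholders.

```python
# Minimal sketch (NOT the authors' code) of an AID-style hypergradient with a
# truncated Neumann series for the Hessian-inverse-vector product.
import jax
import jax.numpy as jnp

def inner_loss(x, y):
    # Toy inner objective g(x, y), strongly convex in y (Hessian = 1.2 * I).
    return 0.5 * jnp.sum((y - x) ** 2) + 0.1 * jnp.sum(y ** 2)

def outer_loss(x, y):
    # Toy outer objective f(x, y).
    return jnp.sum((y - 1.0) ** 2) + 0.01 * jnp.sum(x ** 2)

def hypergradient(x, y, eta=0.5, K=50):
    """grad Phi(x) ~= grad_x f - grad_x grad_y g [grad_y^2 g]^{-1} grad_y f."""
    grad_y_g = jax.grad(inner_loss, argnums=1)
    gy_f = jax.grad(outer_loss, argnums=1)(x, y)

    # Hessian-vector product v -> grad_y^2 g(x, y) v, without forming the Hessian.
    def hvp(v):
        return jax.jvp(grad_y_g, (x, y), (jnp.zeros_like(x), v))[1]

    # Truncated Neumann series:
    # [grad_y^2 g]^{-1} u ~= eta * sum_{k=0}^{K} (I - eta * H)^k u,  for eta < 1/L.
    v, s = gy_f, gy_f
    for _ in range(K):
        v = v - eta * hvp(v)
        s = s + v
    ihvp = eta * s

    # Mixed second-order term grad_x grad_y g(x, y) @ ihvp, computed as the
    # gradient of an inner product (ihvp is a constant w.r.t. xx inside grad).
    cross = jax.grad(lambda xx: jnp.vdot(grad_y_g(xx, y), ihvp))(x)
    return jax.grad(outer_loss, argnums=0)(x, y) - cross

x = jnp.ones(3)
y_star = x / 1.2  # closed-form inner minimizer of the toy g; in practice, run SGD
print(hypergradient(x, y_star))
```

For this toy problem the output can be checked against the closed form $\nabla \Phi(x) = 2(x/1.2 - 1)/1.2 + 0.02x$, and the Neumann stepsize must satisfy $\eta < 1/L$ for the series to converge (here the inner Hessian is $1.2 I$, so $\eta = 0.5$ suffices).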