论文标题
无需替代SGD的紧密收敛速率
On Tight Convergence Rates of Without-replacement SGD
论文作者
论文摘要
为了解决有限和优化问题,无需替换抽样的SGD经验表明表现优于SGD。 $ n $表示成本中的组件数量和$ k $的算法时代的数量,最近的几项工作表明,没有替代SGD的收敛速率比SGD的$ o(1/(nk))$的基线速率更好地依赖于$ n $和$ k $。但是,这些作品之间共有两个主要限制:费率在$ nk $上具有额外的poly-logarithmic因素,并用$κ$表示问题的状况数量,在$κ^c \ log(nk)$ epochs in Bordo $ c> 0 $的情况下,费率保持在$κ^c \ log(nk)$ epochs之后。在这项工作中,我们通过分析各个时期各种的步骤大小来克服这些局限性。
For solving finite-sum optimization problems, SGD without replacement sampling is empirically shown to outperform SGD. Denoting by $n$ the number of components in the cost and $K$ the number of epochs of the algorithm , several recent works have shown convergence rates of without-replacement SGD that have better dependency on $n$ and $K$ than the baseline rate of $O(1/(nK))$ for SGD. However, there are two main limitations shared among those works: the rates have extra poly-logarithmic factors on $nK$, and denoting by $κ$ the condition number of the problem, the rates hold after $κ^c\log(nK)$ epochs for some $c>0$. In this work, we overcome these limitations by analyzing step sizes that vary across epochs.