无需替代SGD的紧密收敛速率

论文标题

无需替代SGD的紧密收敛速率

On Tight Convergence Rates of Without-replacement SGD

论文作者

Ahn, Kwangjun, Sra, Suvrit

论文摘要

为了解决有限和优化问题，无需替换抽样的SGD经验表明表现优于SGD。 $ n $表示成本中的组件数量和$ k $的算法时代的数量，最近的几项工作表明，没有替代SGD的收敛速率比SGD的$ o（1/（nk））$的基线速率更好地依赖于$ n $和$ k $。但是，这些作品之间共有两个主要限制：费率在$ nk $上具有额外的poly-logarithmic因素，并用$κ$表示问题的状况数量，在$κ^c \ log（nk）$ epochs in Bordo $ c> 0 $的情况下，费率保持在$κ^c \ log（nk）$ epochs之后。在这项工作中，我们通过分析各个时期各种的步骤大小来克服这些局限性。

For solving finite-sum optimization problems, SGD without replacement sampling is empirically shown to outperform SGD. Denoting by $n$ the number of components in the cost and $K$ the number of epochs of the algorithm , several recent works have shown convergence rates of without-replacement SGD that have better dependency on $n$ and $K$ than the baseline rate of $O(1/(nK))$ for SGD. However, there are two main limitations shared among those works: the rates have extra poly-logarithmic factors on $nK$, and denoting by $κ$ the condition number of the problem, the rates hold after $κ^c\log(nK)$ epochs for some $c>0$. In this work, we overcome these limitations by analyzing step sizes that vary across epochs.

下载PDF全文

下载文献需遵守相关版权规定

论文标题