Paper Title
Fast Rate Generalization Error Bounds: Variations on a Theme
Paper Authors
Paper Abstract
A recent line of works, initiated by Russo and Xu, has shown that the generalization error of a learning algorithm can be upper bounded by information measures. In most of the relevant works, the convergence rate of the expected generalization error is of the form O(√(λ/n)), where λ is some information-theoretic quantity, such as the mutual information between the data sample and the learned hypothesis. However, such a rate is typically considered "slow" compared to the "fast rate" of O(1/n) achievable in many learning scenarios. In this work, we first show that the square root does not necessarily imply a slow rate, and that a fast-rate (O(1/n)) result can still be obtained from this bound under appropriate assumptions. Furthermore, we identify the key condition needed for fast-rate generalization error bounds, which we call the (η, c)-central condition. Under this condition, we give information-theoretic bounds on the generalization error and excess risk with a convergence rate of O(λ/n) for specific learning algorithms such as empirical risk minimization. Finally, analytical examples are given to show the effectiveness of the bounds.
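For context, a minimal LaTeX sketch of the two rates contrasted in the abstract. The slow-rate bound shown is the standard mutual-information bound of Xu and Raginsky; the sub-Gaussian parameter σ, the sample S = (Z_1, …, Z_n), and the learned hypothesis W are standard notation assumed here rather than taken from the abstract, and the exact form of the paper's (η, c)-central condition and fast-rate bound should be read from the paper itself (only the basic η-central condition of van Erven et al. is sketched).

% Slow rate: if the loss \ell(w, Z) is \sigma-sub-Gaussian for every
% hypothesis w, the expected generalization error satisfies
\[
  \bigl|\mathbb{E}[\mathrm{gen}(W, S)]\bigr|
    \le \sqrt{\frac{2\sigma^{2}\, I(W; S)}{n}},
  \qquad \text{i.e. } O\!\left(\sqrt{\lambda/n}\right)
  \text{ with } \lambda = I(W; S).
\]
% Fast rate: the (\eta, c)-central condition is a variant of the
% \eta-central condition, whose basic form requires a comparator
% hypothesis w^{*} such that
\[
  \mathbb{E}\Bigl[e^{-\eta\,\bigl(\ell(w, Z) - \ell(w^{*}, Z)\bigr)}\Bigr] \le 1
  \quad \text{for all } w.
\]
% Under such a condition, the paper reports bounds whose dependence
% on n improves to
\[
  \mathbb{E}[\mathrm{gen}(W, S)] = O\!\left(\frac{I(W; S)}{n}\right).
\]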