Paper Title

Rethinking Parameter Counting in Deep Models: Effective Dimensionality Revisited

Paper Authors

Maddox, Wesley J., Benton, Gregory, Wilson, Andrew Gordon

Paper Abstract

Neural networks appear to have mysterious generalization properties when using parameter counting as a proxy for complexity. Indeed, neural networks often have many more parameters than there are data points, yet still provide good generalization performance. Moreover, when we measure generalization as a function of parameters, we see double descent behaviour, where the test error decreases, increases, and then again decreases. We show that many of these properties become understandable when viewed through the lens of effective dimensionality, which measures the dimensionality of the parameter space determined by the data. We relate effective dimensionality to posterior contraction in Bayesian deep learning, model selection, width-depth tradeoffs, double descent, and functional diversity in loss surfaces, leading to a richer understanding of the interplay between parameters and functions in deep models. We also show that effective dimensionality compares favourably to alternative norm- and flatness-based generalization measures.
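To make the central quantity concrete, below is a minimal sketch of computing effective dimensionality from the eigenvalues of the loss Hessian, using the standard definition N_eff(H, z) = sum_i lambda_i / (lambda_i + z), where lambda_i are the Hessian eigenvalues and z > 0 is a regularization constant. The toy eigenvalue spectrum and the choice z = 1.0 are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def effective_dimensionality(eigenvalues, z=1.0):
    """N_eff(H, z) = sum_i lambda_i / (lambda_i + z).

    eigenvalues: eigenvalues of the Hessian of the training loss.
    z: regularization constant (> 0); the default 1.0 is an illustrative choice.
    """
    eigenvalues = np.asarray(eigenvalues, dtype=float)
    return float(np.sum(eigenvalues / (eigenvalues + z)))

# Hypothetical spectrum: a few large eigenvalues (directions determined by the
# data) and many near-zero ones (flat directions the data does not constrain).
rng = np.random.default_rng(0)
eigs = np.concatenate([
    rng.uniform(50.0, 100.0, size=5),    # well-determined directions
    rng.uniform(0.0, 1e-3, size=995),    # poorly determined directions
])

# Roughly 5, even though there are 1000 "parameters" in this toy example,
# illustrating how effective dimensionality can sit far below the raw count.
print(effective_dimensionality(eigs, z=1.0))
```

Directions with eigenvalues much larger than z each contribute close to 1, while near-flat directions contribute close to 0, which is why heavily overparameterized models can still have a small effective dimensionality.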
