Paper Title


What does a deep neural network confidently perceive? The effective dimension of high certainty class manifolds and their low confidence boundaries

Paper Authors

Stanislav Fort, Ekin Dogus Cubuk, Surya Ganguli, Samuel S. Schoenholz

Paper Abstract


Deep neural network classifiers partition input space into high confidence regions for each class. The geometry of these class manifolds (CMs) is widely studied and intimately related to model performance; for example, the margin depends on CM boundaries. We exploit the notions of Gaussian width and Gordon's escape theorem to tractably estimate the effective dimension of CMs and their boundaries through tomographic intersections with random affine subspaces of varying dimension. We show several connections between the dimension of CMs, generalization, and robustness. In particular we investigate how CM dimension depends on 1) the dataset, 2) architecture (including ResNet, WideResNet & Vision Transformer), 3) initialization, 4) stage of training, 5) class, 6) network width, 7) ensemble size, 8) label randomization, 9) training set size, and 10) robustness to data corruption. Together a picture emerges that higher performing and more robust models have higher dimensional CMs. Moreover, we offer a new perspective on ensembling via intersections of CMs. Our code is at https://github.com/stanislavfort/slice-dice-optimize/
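The slicing idea behind the abstract can be illustrated with a toy experiment. This is a minimal NumPy sketch, not the authors' code: a ball offset from the origin stands in for a high-confidence class region in R^n, and we measure how often a random d-dimensional linear subspace intersects it. Consistent with Gordon's escape theorem, the intersection probability jumps from near 0 to near 1 once d crosses a threshold set by the region's geometry; all function names and parameters here are illustrative.

```python
import numpy as np

def subspace_hits_ball(n, d, center, radius, rng):
    """Does a random d-dim linear subspace of R^n intersect the ball
    {x : ||x - center|| <= radius}?  It does iff the distance from
    `center` to the subspace is at most `radius`."""
    # Orthonormal basis of a uniformly random d-dimensional subspace.
    V, _ = np.linalg.qr(rng.standard_normal((n, d)))
    proj = V @ (V.T @ center)  # projection of center onto the subspace
    return np.linalg.norm(center - proj) <= radius

def hit_rate(n, d, center, radius, trials=50, seed=0):
    """Fraction of random d-dim subspaces that intersect the ball."""
    rng = np.random.default_rng(seed)
    return np.mean([subspace_hits_ball(n, d, center, radius, rng)
                    for _ in range(trials)])

n = 100
center = np.zeros(n)
center[0] = 5.0   # toy "high-confidence region": ball of radius 4 at distance 5
radius = 4.0

for d in (2, 36, 95):
    print(f"d = {d:3d}  hit rate = {hit_rate(n, d, center, radius):.2f}")
```

In the paper the ball is replaced by an actual class manifold of a trained network (the subspace is swept and the classifier queried along it), and the dimension at which intersections start occurring yields the effective-dimension estimate.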
