Title
Topological obstructions in neural networks learning
Authors
Abstract
We apply topological data analysis methods to loss functions to gain insight into the learning of deep neural networks and their generalization properties. We use the Morse complex of the loss function to relate the local behavior of gradient descent trajectories to the global properties of the loss surface. We define the neural network Topological Obstructions score ("TO-score") with the help of robust topological invariants, the barcodes of the loss function, which quantify the "badness" of local minima for gradient-based optimization. We have performed experiments computing these invariants for fully-connected, convolutional, and ResNet-like neural networks on different datasets: MNIST, Fashion-MNIST, CIFAR10, CIFAR100, and SVHN. Our two principal observations are as follows. First, the neural network barcode and TO-score decrease as network depth and width increase; thus, the topological obstructions to learning diminish. Second, in certain situations there is an intriguing connection between the lengths of minima segments in the barcode and the generalization errors of the corresponding minima.
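To make the barcode notion concrete, here is a minimal, self-contained sketch (not the authors' implementation) of the 0-dimensional barcode of a sampled 1D function under its sublevel-set filtration. Each bar (birth, death) pairs a local minimum with the saddle value at which its basin merges into an older one; by the elder rule, the component born at the lower value survives the merge. The function name and the toy loss values below are illustrative assumptions.

```python
def sublevel_barcode(values):
    """Return (birth, death) bars for local minima of a 1D sequence.

    The global minimum gets death = None (an infinite bar).
    Illustrative sketch only; real loss surfaces are high-dimensional.
    """
    n = len(values)
    # sweep indices in order of increasing function value
    order = sorted(range(n), key=lambda i: values[i])
    parent = {}   # union-find over activated indices
    birth = {}    # component representative -> birth value

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    bars = []
    for i in order:
        parent[i] = i
        birth[i] = values[i]
        # merge with already-activated neighbors; elder rule decides
        # which component dies at the current value
        for j in (i - 1, i + 1):
            if j in parent:
                ri, rj = find(i), find(j)
                if ri != rj:
                    old, young = (ri, rj) if birth[ri] <= birth[rj] else (rj, ri)
                    bars.append((birth[young], values[i]))
                    parent[young] = old
    # drop zero-length bars created by saddle activations
    bars = [b for b in bars if b[0] < b[1]]
    # the surviving component is the global minimum's infinite bar
    bars.append((birth[find(order[0])], None))
    return sorted(bars, key=lambda b: b[0])

# toy "loss": two basins; the shallower one yields a short finite bar
loss = [3.0, 1.0, 2.5, 0.0, 3.0]
print(sublevel_barcode(loss))  # [(0.0, None), (1.0, 2.5)]
```

The length of the finite bar (here 2.5 - 1.0 = 1.5) measures how deep the secondary basin is relative to the saddle separating it from the global minimum, i.e. the "badness" of that local minimum for gradient-based optimization in the sense the abstract describes.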