Paper Title
Perceptron Theory Can Predict the Accuracy of Neural Networks
Paper Authors
Paper Abstract
Multilayer neural networks set the current state of the art for many technical classification problems. But, essentially, these networks remain black boxes when it comes to analyzing them and predicting their performance. Here, we develop a statistical theory for the one-layer perceptron and show that it can predict the performance of a surprisingly large variety of neural networks with different architectures. A general theory of classification with perceptrons is developed by generalizing an existing theory for analyzing reservoir computing models and connectionist models for symbolic reasoning known as vector symbolic architectures. Our statistical theory offers three formulas that leverage the signal statistics with increasing detail. The formulas are analytically intractable, but they can be evaluated numerically. The description level that captures maximum detail requires stochastic sampling methods. Depending on the network model, the simpler formulas already yield high prediction accuracy. The quality of the theory's predictions is assessed in three experimental settings: a memorization task for echo state networks (ESNs) from the reservoir computing literature, a collection of classification datasets for shallow randomly connected networks, and the ImageNet dataset for deep convolutional neural networks. We find that the second description level of the perceptron theory can predict the performance of ESN types that could not be described previously. The theory can predict the performance of deep multilayer neural networks when applied to their output layer. While other methods for predicting neural network performance commonly require training an estimator model, the proposed theory requires only the first two moments of the distribution of the postsynaptic sums in the output neurons. The perceptron theory compares favorably to other methods that do not rely on training an estimator model.
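To make the core idea concrete, below is a minimal sketch of how an accuracy prediction from the first two moments of the postsynaptic sums might look. It assumes the correct-class neuron's sum is Gaussian and the other neurons' sums are i.i.d. Gaussians; these simplifying assumptions, the concrete moment values, and all function names (e.g., accuracy_numerical, accuracy_sampling) are illustrative, not the paper's exact formulas.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

# Illustrative sketch: predict the accuracy of a perceptron output layer
# from the first two moments of its postsynaptic sums, assuming the
# correct-class neuron's sum is Gaussian(mu_c, sigma_c) and the other
# K - 1 neurons' sums are i.i.d. Gaussian(mu_w, sigma_w).

def accuracy_numerical(mu_c, sigma_c, mu_w, sigma_w, n_classes):
    """Numerically evaluate P(correct neuron has the largest sum)."""
    integrand = lambda x: (norm.pdf(x, mu_c, sigma_c)
                           * norm.cdf((x - mu_w) / sigma_w) ** (n_classes - 1))
    p, _ = quad(integrand, -np.inf, np.inf)
    return p

def accuracy_sampling(mu_c, sigma_c, mu_w, sigma_w, n_classes,
                      n_samples=200_000, seed=0):
    """Stochastic-sampling estimate of the same probability."""
    rng = np.random.default_rng(seed)
    s_c = rng.normal(mu_c, sigma_c, size=n_samples)
    s_w = rng.normal(mu_w, sigma_w, size=(n_samples, n_classes - 1))
    # Correct prediction iff the correct neuron beats every competitor.
    return np.mean(s_c > s_w.max(axis=1))

# Example: 10 classes, correct neuron's mean one unit above the competitors.
print(accuracy_numerical(1.0, 0.5, 0.0, 0.5, 10))
print(accuracy_sampling(1.0, 0.5, 0.0, 0.5, 10))
# The two estimates should closely agree.
```

The integral form reflects the abstract's point that the formulas are analytically intractable yet numerically evaluable, while the sampling variant mirrors the stochastic-sampling methods required at the most detailed description level.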