Paper Title

A Law of Data Separation in Deep Learning

Paper Authors

He, Hangfeng; Su, Weijie J.

Paper Abstract

While deep learning has enabled significant advances in many areas of science, its black-box nature hinders architecture design for future artificial intelligence applications and interpretation for high-stakes decision makings. We addressed this issue by studying the fundamental question of how deep neural networks process data in the intermediate layers. Our finding is a simple and quantitative law that governs how deep neural networks separate data according to class membership throughout all layers for classification. This law shows that each layer improves data separation at a constant geometric rate, and its emergence is observed in a collection of network architectures and datasets during training. This law offers practical guidelines for designing architectures, improving model robustness and out-of-sample performance, as well as interpreting the predictions.
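
The abstract's central quantitative claim is that each layer improves data separation at a constant geometric rate. This can be checked empirically by measuring class separation in the intermediate-layer features. Below is a minimal NumPy sketch under an assumed separation measure built from within-class and between-class covariance, Tr(S_W S_B^+); this is one standard choice in this line of work, not necessarily the paper's exact definition. The law predicts that the logarithm of such a measure decreases roughly linearly with layer index, i.e. the measure decays geometrically from layer to layer.

```python
import numpy as np

def separation_fuzziness(features, labels):
    """Class-separation measure for one layer's features.

    features: (n_samples, dim) array of activations from a single layer.
    labels:   (n_samples,) array of class labels.

    Uses Tr(S_W S_B^+), the trace of the within-class covariance times the
    pseudoinverse of the between-class covariance (an assumed choice here,
    not necessarily the paper's exact quantity). Smaller values mean the
    classes are more cleanly separated.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    n, d = features.shape
    global_mean = features.mean(axis=0)
    s_w = np.zeros((d, d))
    s_b = np.zeros((d, d))
    for c in np.unique(labels):
        x_c = features[labels == c]
        mu_c = x_c.mean(axis=0)
        s_w += (x_c - mu_c).T @ (x_c - mu_c)
        s_b += len(x_c) * np.outer(mu_c - global_mean, mu_c - global_mean)
    s_w /= n
    s_b /= n
    return float(np.trace(s_w @ np.linalg.pinv(s_b)))

def geometric_decay_rate(fuzziness_per_layer):
    """Fit log(D_l) = log(D_0) + l * log(rho) across layer index l.

    A good linear fit means D_l decays at a roughly constant geometric
    rate rho per layer, which is what the law predicts; rho < 1 indicates
    that each layer improves separation by a similar multiplicative factor.
    """
    d_vals = np.asarray(fuzziness_per_layer, dtype=float)
    layers = np.arange(len(d_vals))
    slope, intercept = np.polyfit(layers, np.log(d_vals), 1)
    return np.exp(slope), np.exp(intercept)  # (rho, estimated D_0)
```

To test the law on a trained classifier, one would record intermediate activations of a labeled dataset layer by layer (for example via forward hooks in a deep-learning framework), compute the measure above for each layer, and inspect how well the log-values lie on a straight line across depth.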
