Paper Title
A Combinatorial Perspective on the Optimization of Shallow ReLU Networks
Paper Authors
Paper Abstract
The NP-hard problem of optimizing a shallow ReLU network can be characterized as a combinatorial search over each training example's activation pattern, followed by a constrained convex problem given a fixed set of activation patterns. In this work we explore the implications of this combinatorial aspect of ReLU optimization. We show that it can be naturally modeled via a geometric and combinatorial object known as a zonotope, whose vertex set is isomorphic to the set of feasible activation patterns. This assists in analysis and provides a foundation for further research. We demonstrate its usefulness by exploring the sensitivity of the optimal loss to perturbations of the training data. We then discuss methods of zonotope vertex selection and their relevance to optimization. Overparameterization assists in training by making a randomly chosen vertex more likely to contain a good solution. We introduce a novel polynomial-time vertex selection procedure that provably picks a vertex containing the global optimum using only double the minimum number of parameters required to fit the data. We further introduce a local greedy search heuristic over zonotope vertices and demonstrate that it outperforms gradient descent on underparameterized problems.
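As an illustrative sketch of the decomposition the abstract describes (not the paper's own code): once each example's activation pattern is fixed, a shallow ReLU network's output is linear in suitably reparameterized weights, so the inner problem reduces to a (constrained) least-squares fit. The activation matrix `D` below is a hypothetical fixed pattern, and the sign constraints that make a pattern feasible are omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 20, 3, 4  # samples, input dimension, hidden units

X = rng.normal(size=(n, d))
y = rng.normal(size=n)

# A hypothetical fixed activation pattern: D[i, j] = 1 iff hidden unit j
# is active on example i. In the paper, feasible patterns correspond to
# zonotope vertices; here D is arbitrary for illustration.
D = rng.integers(0, 2, size=(n, m))

# With D fixed, the network output sum_j D[i, j] * (x_i @ v_j + c_j) is
# linear in the reparameterized weights v_j = a_j * w_j, c_j = a_j * b_j,
# so the inner problem is least squares on pattern-masked features.
# (The feasibility constraints on the signs of w_j @ x_i + b_j, which make
# this a *constrained* convex problem, are dropped in this sketch.)
X1 = np.hstack([X, np.ones((n, 1))])          # append bias column
features = np.concatenate(
    [D[:, [j]] * X1 for j in range(m)], axis=1  # shape (n, m*(d+1))
)

theta, _, _, _ = np.linalg.lstsq(features, y, rcond=None)
pred = features @ theta
print("fit residual:", np.linalg.norm(pred - y))
```

The outer, NP-hard part of the problem is then the combinatorial search over the choices of `D`, which the abstract's vertex-selection procedures address.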