Paper Title

Physarum Powered Differentiable Linear Programming Layers and Applications

Paper Authors

Zihang Meng, Sathya N. Ravi, Vikas Singh

Paper Abstract

Consider a learning algorithm which involves an internal call to an optimization routine such as a generalized eigenvalue problem, a cone programming problem or even sorting. Integrating such a method as a layer (or layers) within a trainable deep neural network (DNN) in an efficient and numerically stable way is not straightforward; for instance, strategies for eigendecomposition and differentiable sorting have emerged only recently. We propose an efficient and differentiable solver for general linear programming problems which can be used in a plug-and-play manner within DNNs as a layer. Our development is inspired by a fascinating but not widely used link between the dynamics of slime mold (Physarum) and optimization schemes such as steepest descent. We describe our development and show the use of our solver in a video segmentation task and meta-learning for few-shot learning. We review the existing results and provide a technical analysis describing their applicability to our use cases. Our solver performs comparably with a customized projected gradient descent method on the first task and outperforms the differentiable CVXPY-SCS solver on the second task. Experiments show that our solver converges quickly without the need for a feasible initial point. Our proposal is easy to implement and can easily serve as a layer whenever a learning procedure needs a fast approximate solution to an LP, within a larger network.
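For readers unfamiliar with the Physarum-LP connection the abstract alludes to, the sketch below shows one common discretization of the Physarum dynamics for a standard-form LP (minimize c^T x subject to Ax = b, x >= 0, with strictly positive costs c). The function name, step size, and iteration count are illustrative assumptions, not the paper's exact layer; in a trainable setting, differentiability would come from unrolling iterations like these inside an autodiff framework.

import numpy as np

def physarum_lp(A, b, c, num_steps=500, h=0.1):
    # Sketch of discretized Physarum dynamics for the LP
    #   minimize c^T x  subject to  A x = b, x >= 0,
    # assuming c > 0. Hyperparameters (num_steps, h) are illustrative.
    m, n = A.shape
    x = np.ones(n)                        # any positive start; feasibility not required
    for _ in range(num_steps):
        W = x / c                         # diagonal conductances W_ii = x_i / c_i
        AW = A * W                        # equals A @ diag(W)
        p = np.linalg.solve(AW @ A.T, b)  # potentials: solve (A W A^T) p = b
        q = W * (A.T @ p)                 # flow q = W A^T p, which satisfies A q = b
        x = (1.0 - h) * x + h * q         # Euler step of the dynamics dx/dt = q - x
    return x

# Toy usage: minimize x1 + 2*x2 subject to x1 + x2 = 1, x >= 0 (optimum [1, 0]).
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
c = np.array([1.0, 2.0])
print(physarum_lp(A, b, c))               # approaches [1, 0]

One way to read the abstract's claim about initial points: since the flow q computed at every step satisfies Aq = b by construction, each update pulls the iterate toward feasibility, so a feasible starting point is not needed.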
