Paper Title
Scaling up and Stabilizing Differentiable Planning with Implicit Differentiation
Paper Authors
Paper Abstract
Differentiable planning promises end-to-end differentiability and adaptivity. However, one issue prevents these methods from scaling up to larger problems: they must differentiate through the forward iteration layers to compute gradients, which couples the forward computation with backpropagation and forces a trade-off between forward-planner performance and the computational cost of the backward pass. To alleviate this issue, we propose to differentiate through the Bellman fixed-point equation, decoupling the forward and backward passes for the Value Iteration Network (VIN) and its variants. This yields a backward cost that is constant in the planning horizon, allows a flexible forward budget, and helps scale up to large tasks. We study the convergence stability, scalability, and efficiency of the proposed implicit version of VIN and its variants, and demonstrate their advantages on a range of planning tasks: 2D navigation, visual navigation, and 2-DOF manipulation in configuration space and workspace.
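To illustrate the core idea in the abstract, the sketch below shows implicit differentiation through a Bellman fixed point on a small tabular MDP. This is a hypothetical, simplified illustration, not the paper's actual VIN implementation: the forward pass runs value iteration for as many steps as needed, while the backward pass is a single linear solve at the fixed point (via the implicit function theorem, assuming the greedy policy is locally stable), so its cost does not grow with the number of forward iterations.

```python
import numpy as np

def value_iteration(R, P, gamma, iters=500):
    """Forward pass: iterate the Bellman optimality operator to (near) fixed point.

    R: (S, A) rewards, P: (A, S, S') transition probabilities.
    Returns the value function V* and the greedy policy pi.
    """
    S, A = R.shape
    V = np.zeros(S)
    for _ in range(iters):
        Q = R + gamma * np.einsum('ast,t->sa', P, V)  # Q(s, a)
        V = Q.max(axis=1)
    return V, Q.argmax(axis=1)

def implicit_grad_wrt_R(R, P, gamma, pi, dL_dV):
    """Backward pass via the implicit function theorem.

    At the fixed point, V* = R[s, pi(s)] + gamma * P_pi V*, so (holding the
    greedy policy pi fixed) gradients flow through (I - gamma * P_pi)^{-1}.
    One (S x S) linear solve replaces backprop through all forward iterations.
    """
    S, A = R.shape
    P_pi = P[pi, np.arange(S), :]  # (S, S): transitions under the greedy policy
    lam = np.linalg.solve((np.eye(S) - gamma * P_pi).T, dL_dV)  # adjoint solve
    g = np.zeros_like(R)
    g[np.arange(S), pi] = lam      # only the greedy action's reward gets gradient
    return g
```

Note the decoupling the abstract describes: raising `iters` improves the forward fixed point without changing the backward pass at all, whereas unrolled backpropagation would grow linearly in memory and compute with the iteration count.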