Paper title
LQR with Tracking: A Zeroth-order Approach and Its Global Convergence
Paper authors
Paper abstract
There has been substantial recent progress on the theoretical understanding of model-free approaches to Linear Quadratic Regulator (LQR) problems. Much attention has been devoted to the special case when the goal is to drive the state close to a zero target. In this work, we consider the general case where the target is allowed to be arbitrary, which we refer to as the LQR tracking problem. We study the optimization landscape of this problem, and show that similar to the zero-target LQR problem, the LQR tracking problem also satisfies gradient dominance and local smoothness properties. This allows us to develop a zeroth-order policy gradient algorithm that achieves global convergence. We support our arguments with numerical simulations on a linear system.
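To make the idea of a zeroth-order policy gradient concrete, the following is a minimal sketch and not the paper's algorithm. The abstract does not specify the policy parameterization, the estimator, or any constants, so everything here is an assumption chosen for illustration: a scalar system x_{t+1} = a x_t + b u_t, an affine policy u_t = -k x_t + c, a finite-horizon tracking cost, and a generic two-point random-direction gradient estimator that only queries cost values.

```python
import numpy as np

# Hypothetical scalar system and cost weights (not taken from the paper).
A, B = 0.9, 0.5        # dynamics: x_{t+1} = A * x_t + B * u_t
Q, R = 1.0, 0.1        # state-error and control weights
TARGET = 1.0           # nonzero target the state should track
HORIZON = 30           # finite rollout length


def tracking_cost(theta, x0=0.0):
    """Finite-horizon tracking cost of the affine policy u = -k * x + c."""
    k, c = theta
    x, cost = x0, 0.0
    for _ in range(HORIZON):
        u = -k * x + c
        cost += Q * (x - TARGET) ** 2 + R * u ** 2
        x = A * x + B * u
    return cost


def zeroth_order_gradient(cost_fn, theta, radius, rng):
    """Two-point estimate of the gradient using only cost evaluations."""
    d = theta.size
    direction = rng.standard_normal(d)
    direction /= np.linalg.norm(direction)  # uniform direction on the unit sphere
    diff = cost_fn(theta + radius * direction) - cost_fn(theta - radius * direction)
    return (d / (2.0 * radius)) * diff * direction


def run(num_iters=300, step=5e-5, radius=0.01, seed=0):
    """Plain gradient descent on the policy parameters with estimated gradients."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)  # start from k = 0, c = 0 (no control)
    for _ in range(num_iters):
        grad = zeroth_order_gradient(tracking_cost, theta, radius, rng)
        theta = theta - step * grad
    return theta


if __name__ == "__main__":
    theta = run()
    print("initial cost:", tracking_cost(np.zeros(2)))
    print("final cost:  ", tracking_cost(theta))
    print("learned (k, c):", theta)
```

The update never uses the system matrices directly; it only compares the rollout cost at two perturbed parameter vectors, which is what makes the method model-free. The step size and smoothing radius are hand-picked for this toy problem and would need tuning for any other system.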