Paper title
Optimal exploration strategies for finite horizon regret minimization in some adaptive control problems
Paper authors
Paper abstract
In this work, we consider the problem of regret minimization in adaptive minimum variance and linear quadratic control problems. Regret minimization has been extensively studied in the literature for both types of adaptive control problems. Most of these works give results on the optimal rate of the regret in the asymptotic regime. In the minimum variance case, the optimal asymptotic rate for the regret is $\log(T)$, which can be reached without any additional external excitation. In contrast, for most adaptive linear quadratic problems, it is necessary to add an external excitation in order to get the optimal asymptotic rate of $\sqrt{T}$. In this paper, we show, through a theoretical study as well as in simulations, that when the control horizon is pre-specified, a lower regret can be obtained with either no external excitation or a new exploration type termed immediate.