论文标题
对数遗憾的在线学习卡尔曼过滤器
Online Learning of the Kalman Filter with Logarithmic Regret
论文作者
论文摘要
在本文中,我们考虑了预测由未知的,部分观察到的线性系统在线产生的观察结果的问题,该系统是由随机噪声驱动的。对于此类系统,在均等意义上,最佳预测指标是著名的Kalman滤波器,可以在已知系统模型时明确计算。当系统模型未知时,我们必须学习如何基于有限数据在线预测观察结果,而对卡尔曼过滤器的预测,可能会遭受非零的遗憾。我们表明,有可能对$ \ mathrm {poly} \ log(n)$的订单感到遗憾,其中$ n $是收集的观测值的数量。我们的工作是第一个为广泛使用的卡尔曼过滤器提供对数后悔保证的工作。这是使用在线最小二乘算法实现的,该算法利用了未来观察和过去观察结果之间的近似线性关系。遗憾的分析基于卡尔曼过滤器的稳定性,最新的系统识别样本分析统计工具以及用于分析时间序列最小二乘算法的经典结果。在未知噪声统计的情况下,我们的遗憾分析也可以用于对隐藏状态的状态预测,但已知的状态空间基础。基本的技术贡献是,我们的界限即使对于非爆炸系统的类别,其中包括一类边缘稳定的系统,这对于在随机噪声下在线预测的情况下是一个开放的问题。
In this paper, we consider the problem of predicting observations generated online by an unknown, partially observed linear system, which is driven by stochastic noise. For such systems the optimal predictor in the mean square sense is the celebrated Kalman filter, which can be explicitly computed when the system model is known. When the system model is unknown, we have to learn how to predict observations online based on finite data, suffering possibly a non-zero regret with respect to the Kalman filter's prediction. We show that it is possible to achieve a regret of the order of $\mathrm{poly}\log(N)$ with high probability, where $N$ is the number of observations collected. Our work is the first to provide logarithmic regret guarantees for the widely used Kalman filter. This is achieved using an online least-squares algorithm, which exploits the approximately linear relation between future observations and past observations. The regret analysis is based on the stability properties of the Kalman filter, recent statistical tools for finite sample analysis of system identification, and classical results for the analysis of least-squares algorithms for time series. Our regret analysis can also be applied for state prediction of the hidden state, in the case of unknown noise statistics but known state-space basis. A fundamental technical contribution is that our bounds hold even for the class of non-explosive systems, which includes the class of marginally stable systems, which was an open problem for the case of online prediction under stochastic noise.