Uncertainty-based Meta-Reinforcement Learning for Robust Radar Tracking

Paper Title

Uncertainty-based Meta-Reinforcement Learning for Robust Radar Tracking

Authors

Julius Ott, Lorenzo Servadei, Gianfranco Mauro, Thomas Stadelmayer, Avik Santra, Robert Wille

Abstract

Nowadays, Deep Learning (DL) methods often overcome the limitations of traditional signal processing approaches. Nevertheless, DL methods are barely applied in real-life applications. This is mainly due to limited robustness and distributional shift between training and test data. To this end, recent work has proposed uncertainty mechanisms to increase their reliability. Besides, meta-learning aims at improving the generalization capability of DL models. By taking advantage of that, this paper proposes an uncertainty-based Meta-Reinforcement Learning (Meta-RL) approach with Out-of-Distribution (OOD) detection. The presented method performs a given task in unseen environments and provides information about its complexity. This is done by determining first and second-order statistics on the estimated reward. Using information about its complexity, the proposed algorithm is able to point out when tracking is reliable. To evaluate the proposed method, we benchmark it on a radar-tracking dataset. There, we show that our method outperforms related Meta-RL approaches on unseen tracking scenarios in peak performance by 16% and the baseline by 35% while detecting OOD data with an F1-Score of 72%. This shows that our method is robust to environmental changes and reliably detects OOD scenarios.
