Geometric Reinforcement Learning For Robotic Manipulation

Paper Title

Geometric Reinforcement Learning For Robotic Manipulation

Authors

Naseem Alhousani, Matteo Saveriano, Ibrahim Sevinc, Talha Abdulkuddus, Hatice Kose, Fares J. Abu-Dakka

Abstract

Reinforcement learning (RL) is a popular technique that allows an agent to learn by trial and error while interacting with a dynamic environment. The traditional RL approach has been successful in learning and predicting robotic manipulation skills described by Euclidean data such as positions, velocities, and forces. However, in robotics, it is common to encounter non-Euclidean data such as orientation or stiffness, and failing to account for their geometric nature can negatively impact learning accuracy and performance. In this paper, to address this challenge, we propose a novel framework for RL that leverages Riemannian geometry, which we call Geometric Reinforcement Learning (G-RL), to enable agents to learn robotic manipulation skills with non-Euclidean data. Specifically, G-RL utilizes the tangent space in two ways: a tangent space for parameterization and a local tangent space for mapping to a non-Euclidean manifold. The policy is learned in the parameterization tangent space, which remains constant throughout the training. The policy is then transferred to the local tangent space via parallel transport and projected onto the non-Euclidean manifold. The local tangent space changes over time to remain within the neighborhood of the current manifold point, reducing the approximation error. Therefore, by introducing a geometrically grounded pre- and post-processing step into the traditional RL pipeline, our G-RL framework enables several model-free algorithms designed for Euclidean space to learn from non-Euclidean data without modifications. Experimental results, obtained both in simulation and on a real robot, support our hypothesis that G-RL is more accurate and converges to a better solution than approximating non-Euclidean data.
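
The pre- and post-processing idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the manifold is the unit sphere S^3 (unit quaternions, a common choice for orientation), that the fixed parameterization tangent space sits at the identity quaternion, and that pre-processing is a logarithmic map at that base point. The names `BASE`, `preprocess`, and `postprocess` are hypothetical and chosen only for illustration.

```python
import numpy as np

EPS = 1e-12
# Assumed fixed base point of the parameterization tangent space (identity quaternion).
BASE = np.array([1.0, 0.0, 0.0, 0.0])

def exp_map(p, v):
    """Move from p along tangent vector v (v orthogonal to p) onto the unit sphere."""
    t = np.linalg.norm(v)
    if t < EPS:
        return p.copy()
    return np.cos(t) * p + np.sin(t) * v / t

def log_map(p, q):
    """Tangent vector at p that points to q, with length equal to the geodesic distance."""
    c = np.clip(np.dot(p, q), -1.0, 1.0)
    u = q - c * p                      # component of q orthogonal to p
    n = np.linalg.norm(u)
    if n < EPS:
        return np.zeros_like(p)
    return np.arccos(c) * u / n

def parallel_transport(p, q, v):
    """Transport v from the tangent space at p to the one at q (requires q != -p)."""
    return v - np.dot(q, v) / (1.0 + np.dot(p, q)) * (p + q)

def preprocess(x):
    """Manifold state -> Euclidean coordinates in the fixed tangent space (assumed step)."""
    return log_map(BASE, x)[1:]

def postprocess(x, action):
    """Euclidean policy output in R^3 -> next point on the manifold."""
    v_base = np.concatenate(([0.0], action))       # tangent vector at BASE
    v_local = parallel_transport(BASE, x, v_base)  # move it to the local tangent space at x
    return exp_map(x, v_local)                     # project onto the manifold

# Usage: one step with a stand-in action (a trained policy would supply it).
x = exp_map(BASE, np.array([0.0, 0.3, -0.2, 0.1]))
action = np.array([0.05, 0.0, -0.02])
x_next = postprocess(x, action)
```

In this sketch any Euclidean model-free algorithm could consume `preprocess(x)` as its observation and emit `action` unchanged. Only the two wrapper functions know about the manifold, which mirrors the "without modifications" claim in the abstract.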
