Paper Title


Foundation Models for Semantic Novelty in Reinforcement Learning

Paper Authors

Tarun Gupta, Peter Karkus, Tong Che, Danfei Xu, Marco Pavone

Abstract


Effectively exploring the environment is a key challenge in reinforcement learning (RL). We address this challenge by defining a novel intrinsic reward based on a foundation model, such as Contrastive Language-Image Pre-training (CLIP), which can encode a wealth of domain-independent semantic visual-language knowledge about the world. Specifically, our intrinsic reward is defined based on pre-trained CLIP embeddings, without any fine-tuning or learning on the target RL task. We demonstrate that CLIP-based intrinsic rewards can drive exploration towards semantically meaningful states and outperform state-of-the-art methods in challenging sparse-reward, procedurally generated environments.
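The abstract does not specify how the embedding-based novelty bonus is computed, but one common construction for such intrinsic rewards is an episodic novelty bonus: embed each observation with a frozen encoder and reward states whose embedding is far from anything seen earlier in the episode. The sketch below illustrates that idea; `clip_embed` is a hypothetical stand-in for a frozen, pre-trained CLIP image encoder, and the nearest-neighbor novelty rule is an assumption, not necessarily the paper's exact formulation.

```python
import hashlib
import numpy as np

def clip_embed(observation: np.ndarray) -> np.ndarray:
    # Hypothetical stand-in for a frozen, pre-trained CLIP image encoder.
    # A real implementation would run the observation through the CLIP
    # image tower and L2-normalize the output; here we derive a
    # deterministic unit vector from the observation bytes instead.
    seed = int.from_bytes(hashlib.sha256(observation.tobytes()).digest()[:4], "little")
    v = np.random.default_rng(seed).normal(size=512)
    return v / np.linalg.norm(v)

class ClipNoveltyReward:
    """Episodic novelty bonus computed in a frozen embedding space."""

    def __init__(self):
        self.memory = []  # embeddings of states visited this episode

    def reset(self):
        # Call at the start of each episode.
        self.memory.clear()

    def __call__(self, observation: np.ndarray) -> float:
        z = clip_embed(observation)
        if not self.memory:
            self.memory.append(z)
            return 1.0  # first state of the episode is maximally novel
        # Novelty = distance to the nearest previously seen embedding:
        # revisiting a semantically similar state earns little reward.
        nearest = min(np.linalg.norm(z - m) for m in self.memory)
        self.memory.append(z)
        return float(nearest)
```

In use, this bonus would be added to (or substituted for) the sparse environment reward at each step, with `reset()` called on episode boundaries; because the encoder is frozen, no gradients or task-specific learning are involved, matching the abstract's "without any fine-tuning" claim.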
