CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement Learning

Paper title

CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement Learning

Authors

Siddharth Verma, Justin Fu, Mengjiao Yang, Sergey Levine

Abstract

Conventionally, generation of natural language for dialogue agents may be viewed as a statistical learning problem: determine the patterns in human-provided data and generate appropriate responses with similar statistical properties. However, dialogue can also be regarded as a goal-directed process, where speakers attempt to accomplish a specific task. Reinforcement learning (RL) algorithms are designed specifically for solving such goal-directed problems, but the most direct way to apply RL, through trial-and-error learning in conversations with humans, is costly. In this paper, we study how offline reinforcement learning can instead be used to train dialogue agents entirely using static datasets collected from human speakers. Our experiments show that recently developed offline RL methods can be combined with language models to yield realistic dialogue agents that better accomplish task goals.
