BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage

Paper title

BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage

Authors

Kurt Shuster, Jing Xu, Mojtaba Komeili, Da Ju, Eric Michael Smith, Stephen Roller, Megan Ung, Moya Chen, Kushal Arora, Joshua Lane, Morteza Behrooz, William Ngan, Spencer Poff, Naman Goyal, Arthur Szlam, Y-Lan Boureau, Melanie Kambadur, Jason Weston

Abstract

We present BlenderBot 3, a 175B parameter dialogue model capable of open-domain conversation with access to the internet and a long-term memory, and having been trained on a large number of user-defined tasks. We release both the model weights and code, and have also deployed the model on a public web page to interact with organic users. This technical report describes how the model was built (architecture, model and training scheme), and details of its deployment, including safety mechanisms. Human evaluations show its superiority to existing open-domain dialogue agents, including its predecessors (Roller et al., 2021; Komeili et al., 2022). Finally, we detail our plan for continual learning using the data collected from deployment, which will also be publicly released. The goal of this research program is thus to enable the community to study ever-improving responsible agents that learn through interaction.
