The Linguistic Blind Spot of Value-Aligned Agency, Natural and Artificial

Paper title

The Linguistic Blind Spot of Value-Aligned Agency, Natural and Artificial

Author

LaCroix, Travis

Abstract

The value-alignment problem for artificial intelligence (AI) asks how we can ensure that the 'values' (i.e., objective functions) of artificial systems are aligned with the values of humanity. In this paper, I argue that linguistic communication (natural language) is a necessary condition for robust value alignment. I discuss the consequences that the truth of this claim would have for research programmes that attempt to ensure value alignment for AI systems; or, more loftily, designing robustly beneficial or ethical artificial agents.
