Paper Title
The Linguistic Blind Spot of Value-Aligned Agency, Natural and Artificial
Paper Authors
Abstract
The value-alignment problem for artificial intelligence (AI) asks how we can ensure that the 'values' (i.e., objective functions) of artificial systems are aligned with the values of humanity. In this paper, I argue that linguistic communication (natural language) is a necessary condition for robust value alignment. I discuss the consequences that the truth of this claim would have for research programmes that attempt to ensure value alignment for AI systems; or, more loftily, designing robustly beneficial or ethical artificial agents.