Paper title
Underspecification in Scene Description-to-Depiction Tasks
Paper authors
Paper abstract
Questions regarding implicitness, ambiguity and underspecification are crucial for understanding the task validity and ethical concerns of multimodal image+text systems, yet have received little attention to date. This position paper maps out a conceptual framework to address this gap, focusing on systems which generate images depicting scenes from scene descriptions. In doing so, we account for how texts and images convey meaning differently. We outline a set of core challenges concerning textual and visual ambiguity, as well as risks that may be amplified by ambiguous and underspecified elements. We propose and discuss strategies for addressing these challenges, including generating visually ambiguous images, and generating a set of diverse images.