Paper Title

DAM: Deliberation, Abandon and Memory Networks for Generating Detailed and Non-repetitive Responses in Visual Dialogue

Paper Authors

Xiaoze Jiang, Jing Yu, Yajing Sun, Zengchang Qin, Zihao Zhu, Yue Hu, Qi Wu

Paper Abstract

The Visual Dialogue task requires an agent to be engaged in a conversation with a human about an image. The ability to generate detailed and non-repetitive responses is crucial for the agent to achieve human-like conversation. In this paper, we propose a novel generative decoding architecture to generate high-quality responses, which moves away from decoding the whole encoded semantics towards a design that advocates both transparency and flexibility. In this architecture, word generation is decomposed into a series of attention-based information selection steps, performed by the novel recurrent Deliberation, Abandon and Memory (DAM) module. Each DAM module performs an adaptive combination of the response-level semantics captured from the encoder and the word-level semantics specifically selected for generating each word. Therefore, the responses contain more detailed and non-repetitive descriptions while maintaining semantic accuracy. Furthermore, DAM can flexibly cooperate with existing visual dialogue encoders and adapt to the encoder structure by constraining the information selection mode in DAM. We apply DAM to three typical encoders and verify the performance on the VisDial v1.0 dataset. Experimental results show that the proposed models achieve new state-of-the-art performance with high-quality responses. The code is available at https://github.com/JXZe/DAM.
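The abstract describes the decoder only at a high level: each generated word comes from an attention-based selection step that adaptively mixes a fixed response-level vector from the encoder with a word-level vector attended freshly at that step, with a learned gate deciding how much of each source to keep. The sketch below is only a rough illustration of that idea, not the authors' implementation (which lives in the linked repository); all names here, such as DAMStepSketch and the specific gating layout, are hypothetical.

```python
# Hypothetical sketch of one DAM-style decoding step -- NOT the authors' code.
# Idea only: combine (a) a fixed response-level vector from the encoder with
# (b) a word-level vector attended freshly per step, via a learned gate.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DAMStepSketch(nn.Module):
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.attn = nn.Linear(hidden_dim * 2, 1)           # word-level attention scores
        self.gate = nn.Linear(hidden_dim * 3, hidden_dim)  # "abandon" gate
        self.cell = nn.LSTMCell(hidden_dim * 2, hidden_dim)

    def forward(self, enc_states, resp_vec, h, c):
        # enc_states: (B, T, H) encoder outputs; resp_vec: (B, H) response-level semantics
        q = h.unsqueeze(1).expand_as(enc_states)                # query = decoder state
        scores = self.attn(torch.cat([enc_states, q], -1))      # (B, T, 1)
        word_vec = (F.softmax(scores, 1) * enc_states).sum(1)   # word-level semantics (B, H)
        g = torch.sigmoid(self.gate(torch.cat([resp_vec, word_vec, h], -1)))
        mixed = g * word_vec + (1 - g) * resp_vec               # adaptive combination
        h, c = self.cell(torch.cat([mixed, word_vec], -1), (h, c))
        return h, c

# Tiny smoke test with random tensors.
B, T, H = 2, 5, 16
step = DAMStepSketch(H)
h = c = torch.zeros(B, H)
h, c = step(torch.randn(B, T, H), torch.randn(B, H), h, c)
print(h.shape)  # torch.Size([2, 16])
```

In this reading, the gate g plays the role of the "abandon" decision: at each step it down-weights whichever semantic source is less useful for the current word, which is what allows consecutive words to draw on different details instead of repeatedly decoding the same whole-response vector.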
