Scene-Aware Prompt for Multi-modal Dialogue Understanding and Generation

Paper Title

Scene-Aware Prompt for Multi-modal Dialogue Understanding and Generation

Authors

Li, Bin; Weng, Yixuan; Ma, Ziyu; Sun, Bin; Li, Shutao

Abstract

This paper presents Team LingJing's approach to NLPCC-2022-Shared-Task-4, Multi-modal Dialogue Understanding and Generation (MDUG). The MDUG task can be divided into two phases: multi-modal context understanding and response generation. To fully leverage visual information for both scene understanding and dialogue generation, we propose a scene-aware prompt for the MDUG task. Specifically, we use a multi-task strategy to jointly model scene-level and session-level multi-modal understanding. Visual captions are adopted to capture scene information, while a fixed-type templated prompt based on the scene- and session-aware labels is used to further improve dialogue generation performance. Extensive experimental results show that the proposed method achieves state-of-the-art (SOTA) performance compared with other competitive methods, and we rank first in all three subtasks of the MDUG competition.
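
The idea of a fixed-type templated prompt built from a visual caption and scene/session labels can be illustrated with a minimal sketch. This is not the authors' template: the function name `build_scene_aware_prompt`, the bracketed tag format, and the treatment of the labels as binary "new scene" / "new session" flags are all assumptions made purely for illustration, since the abstract does not specify them.

```python
from typing import List


def build_scene_aware_prompt(
    caption: str,
    utterances: List[str],
    scene_changed: bool,
    session_changed: bool,
) -> str:
    """Assemble a fixed-format text prompt for a response generator.

    Hypothetical layout (an assumption, not the paper's actual template):
    scene tag, session tag, visual caption, dialogue context, then a
    trailing "response:" cue for the generator to continue from.
    """
    # Assumed binary labels: does this turn begin a new scene / session?
    scene_tag = "new scene" if scene_changed else "same scene"
    session_tag = "new session" if session_changed else "same session"

    # Flatten the dialogue history into one whitespace-separated string.
    context = " ".join(u.strip() for u in utterances)

    return (
        f"[scene: {scene_tag}] [session: {session_tag}] "
        f"caption: {caption.strip()} "
        f"context: {context} "
        f"response:"
    )
```

Under these assumptions, the resulting string would be passed as input to a text generation model, so that the predicted labels and the caption condition the response on the visual scene.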
