Paper Title
Counterfactual VQA: A Cause-Effect Look at Language Bias
Paper Authors
Paper Abstract
VQA models tend to rely on language bias as a shortcut and thus fail to sufficiently learn multi-modal knowledge from both vision and language. Recent debiasing methods propose to exclude the language prior during inference. However, they fail to disentangle the "good" language context from the "bad" language bias within the whole prior. In this paper, we investigate how to mitigate language bias in VQA. Motivated by causal effects, we propose a novel counterfactual inference framework, which enables us to capture the language bias as the direct causal effect of questions on answers and to reduce this bias by subtracting the direct language effect from the total causal effect. Experiments demonstrate that our proposed counterfactual inference framework 1) generalizes to various VQA backbones and fusion strategies, and 2) achieves competitive performance on the language-bias-sensitive VQA-CP dataset while performing robustly on the balanced VQA v2 dataset, without any augmented data. The code is available at https://github.com/yuleiniu/cfvqa.
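To make the subtraction described above concrete, the following is a minimal sketch of the effect decomposition implied by the abstract, assuming Y(q, v) denotes the answer scores given question q and image v, and q*, v* denote the counterfactual (blocked) question and image inputs; the notation here is illustrative and the paper's exact formulation may differ:

TE  = Y(q, v)  - Y(q*, v*)            (total causal effect of question and image on the answer)
NDE = Y(q, v*) - Y(q*, v*)            (natural direct effect of the question alone, i.e., the language bias)
TIE = TE - NDE = Y(q, v) - Y(q, v*)   (debiased inference: the effect remaining after removing the pure language shortcut)

Under these assumptions, inference selects the answer that maximizes TIE rather than the raw prediction Y(q, v), so the question still contributes through its interaction with the image while its direct, vision-free influence is subtracted out.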