Paper Title
MM-Align: Learning Optimal Transport-based Alignment Dynamics for Fast and Accurate Inference on Missing Modality Sequences
Paper Authors
Abstract
Existing multimodal tasks mostly target the complete input modality setting, i.e., each modality is either complete or completely missing in both the training and test sets. However, situations in which modalities are randomly missing remain underexplored. In this paper, we present a novel approach named MM-Align to address the missing-modality inference problem. Concretely, we propose 1) an alignment dynamics learning module based on the theory of optimal transport (OT) for indirect missing data imputation; 2) a denoising training algorithm to simultaneously enhance the imputation results and backbone network performance. Compared with previous methods, which are devoted to reconstructing the missing inputs, MM-Align learns to capture and imitate the alignment dynamics between modality sequences. Results of comprehensive experiments on three datasets covering two multimodal tasks empirically demonstrate that our method can perform more accurate and faster inference and relieve overfitting under various missing conditions.
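To make the notion of an OT-based alignment between two modality sequences concrete, the following is a minimal sketch of a generic entropic-regularized OT plan computed with Sinkhorn iterations. It is not the authors' implementation: the function name `sinkhorn_plan`, the squared Euclidean cost, the uniform marginals, and the values of `eps` and `n_iters` are all assumptions chosen for illustration.

```python
import numpy as np

def sinkhorn_plan(x, y, eps=0.5, n_iters=200):
    """Entropic OT plan between sequences x (n, d) and y (m, d).

    Illustrative assumptions: squared Euclidean cost, uniform marginals.
    """
    # Pairwise squared Euclidean cost between every step of x and y.
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    cost = cost / (cost.max() + 1e-12)  # rescale to [0, 1] for stability
    kernel = np.exp(-cost / eps)        # Gibbs kernel

    a = np.full(x.shape[0], 1.0 / x.shape[0])  # uniform source marginal
    b = np.full(y.shape[0], 1.0 / y.shape[0])  # uniform target marginal
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = a / (kernel @ v)
        v = b / (kernel.T @ u)
    return u[:, None] * kernel * v[None, :]

# Hypothetical toy sequences standing in for two modalities.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))
y = rng.normal(size=(7, 4))
plan = sinkhorn_plan(x, y)
# Barycentric projection: each step of x mapped to a weighted mix of y steps.
y_aligned = (plan / plan.sum(axis=1, keepdims=True)) @ y
```

Each entry `plan[i, j]` can be read as how much mass step `i` of one sequence sends to step `j` of the other; how MM-Align learns and reuses such alignments at inference time is described in the paper itself and is not reproduced here.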