Paper title
Mixup for Test-Time Training
Paper authors
Paper abstract
Test-time training provides a new approach to the problem of domain shift. In its framework, a test-time training phase is inserted between the training phase and the test phase. During the test-time training phase, parts of the model are usually updated with the test sample(s). The updated model is then used in the test phase. However, using test samples for test-time training has some limitations. First, it leads to overfitting to the test-time procedure and thus hurts performance on the main task. Second, updating part of the model without changing the other parts induces a mismatch problem, which makes it hard to perform better on the main task. To alleviate these problems, we propose to use mixup in test-time training (MixTTT), which controls the change in the model's parameters while still completing the test-time procedure. We show theoretically that it alleviates the mismatch between the updated part and the static part on the main task, which acts as a specific regularization effect for test-time training. MixTTT can be used as an add-on module in general test-time-training-based methods to further improve their performance. Experimental results show the effectiveness of our method.
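The abstract does not say how MixTTT forms its mixed samples, so the following is only a minimal sketch of standard input mixup, the general technique the method's name refers to. It assumes each test sample is a flat list of floats and that the mixing weight is drawn from a Beta(alpha, alpha) distribution. The function names `mixup_pair` and `mixup_batch` and the default `alpha=0.4` are hypothetical and are not taken from the paper.

```python
import random


def mixup_pair(x_a, x_b, alpha=0.4, rng=None):
    """Return a convex combination of two samples and the weight used.

    Assumption: lam ~ Beta(alpha, alpha), as in standard mixup;
    alpha=0.4 is an illustrative default, not a value from the paper.
    """
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)
    mixed = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    return mixed, lam


def mixup_batch(batch, alpha=0.4, rng=None):
    """Mix every sample with a randomly chosen partner from the same batch.

    Returns a list of (mixed_sample, lam) tuples, one per input sample.
    A sample may be paired with itself, in which case it is unchanged.
    """
    rng = rng or random.Random()
    partners = list(range(len(batch)))
    rng.shuffle(partners)
    return [
        mixup_pair(batch[i], batch[j], alpha, rng)
        for i, j in enumerate(partners)
    ]
```

In a test-time training loop, mixed samples like these could be fed to the test-time objective in place of the raw test samples. Which objective is used, and which part of the model is updated, depends on the underlying test-time training method and is not covered by this sketch.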