Paper Title

How to Fine-tune Models with Few Samples: Update, Data Augmentation, and Test-time Augmentation

Paper Authors

Yujin Kim, Jaehoon Oh, Sungnyun Kim, Se-Young Yun

Paper Abstract

Most of the recent few-shot learning (FSL) algorithms are based on transfer learning, where a model is pre-trained using a large amount of source data and then fine-tuned using a small amount of target data. In transfer learning-based FSL, sophisticated pre-training methods have been widely studied for universal representation. It has therefore become increasingly important to utilize the universal representation for downstream tasks, yet there are few studies on fine-tuning in FSL. In this paper, we focus on how to transfer pre-trained models to few-shot downstream tasks from three perspectives: update, data augmentation, and test-time augmentation. First, we compare the two popular update methods, full fine-tuning (i.e., updating the entire network, FT) and linear probing (i.e., updating only a linear classifier, LP). We find that LP is better than FT with extremely few samples, whereas FT outperforms LP as training samples increase. Next, we show that data augmentation does not guarantee few-shot performance improvement and investigate the effectiveness of data augmentation based on the intensity of augmentation. Finally, considering support-query distribution shifts, we apply augmentation both to the support set used for the update (i.e., data augmentation) and to the query set used for prediction (i.e., test-time augmentation), and improve few-shot performance. The code is available at https://github.com/kimyuji/updating_FSL.
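The following is a minimal PyTorch sketch of the ideas named in the abstract: the two update methods (FT vs. LP) and test-time augmentation on the query set. The ResNet-18 backbone, the `build_model` and `tta_predict` helpers, and all hyperparameter choices are illustrative assumptions, not the authors' released implementation (see the repository above for that).

```python
# Illustrative sketch only (assumptions: torchvision ResNet-18 backbone, new linear head);
# not the authors' exact training code from https://github.com/kimyuji/updating_FSL.
import torch
import torch.nn as nn
import torchvision.models as models


def build_model(num_classes: int, update: str = "LP") -> nn.Module:
    """Return a pretrained ResNet-18 with a fresh linear head.

    update="FT": full fine-tuning, every parameter stays trainable.
    update="LP": linear probing, freeze the backbone and train only the classifier.
    """
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    model.fc = nn.Linear(model.fc.in_features, num_classes)  # new task-specific head
    if update == "LP":
        for name, param in model.named_parameters():
            param.requires_grad = name.startswith("fc.")  # only the linear classifier learns
    return model


def tta_predict(model: nn.Module, query: torch.Tensor, augment, n_views: int = 8) -> torch.Tensor:
    """Test-time augmentation: average softmax predictions over augmented views of the query set."""
    model.eval()
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(augment(query)), dim=-1) for _ in range(n_views)]
        )
    return probs.mean(dim=0)
```

In this sketch, only parameters with `requires_grad=True` would be handed to the optimizer, so LP updates a single linear layer while FT updates every weight; `tta_predict` averages softmax outputs over several augmented views of the query images, mirroring the test-time augmentation perspective described in the abstract.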
