Paper Title

MAML and ANIL Provably Learn Representations

Paper Authors

Liam Collins, Aryan Mokhtari, Sewoong Oh, Sanjay Shakkottai

Paper Abstract

Recent empirical evidence has driven conventional wisdom to believe that gradient-based meta-learning (GBML) methods perform well at few-shot learning because they learn an expressive data representation that is shared across tasks. However, the mechanics of GBML have remained largely mysterious from a theoretical perspective. In this paper, we prove that two well-known GBML methods, MAML and ANIL, as well as their first-order approximations, are capable of learning a common representation among a set of given tasks. Specifically, in the well-known multi-task linear representation learning setting, they are able to recover the ground-truth representation at an exponentially fast rate. Moreover, our analysis illuminates that the driving force causing MAML and ANIL to recover the underlying representation is that they adapt the final layer of their model, which harnesses the underlying task diversity to improve the representation in all directions of interest. To the best of our knowledge, these are the first results to show that MAML and/or ANIL learn expressive representations and to rigorously explain why they do so.
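The multi-task linear representation setting the abstract refers to can be sketched in a few lines of NumPy. This is an illustrative toy reconstruction, not the paper's exact construction: the dimensions, learning rates, sample sizes, and the choice of a single inner gradient step below are all assumptions. Each task's regressor is `B_star @ v_t` for a shared ground-truth representation `B_star`; the ANIL-style inner loop adapts only the last-layer head `w`, and a first-order outer update moves the learned representation `B`. The principal-angle distance between the column spaces of `B` and `B_star` shrinks over iterations, mirroring the recovery result the abstract describes.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n_tasks = 20, 3, 20          # ambient dim, representation dim, number of tasks
inner_lr, outer_lr = 0.4, 0.1     # head adaptation rate, representation update rate
m, steps = 50, 1500                # samples per batch, outer iterations

# Ground truth: col(B_star) is the shared representation; each task t has
# regressor B_star @ V_star[:, t] for a task-specific head V_star[:, t].
B_star = np.linalg.qr(rng.standard_normal((d, k)))[0]
V_star = rng.standard_normal((k, n_tasks))

def subspace_dist(B):
    """Principal-angle distance between col(B) and col(B_star)."""
    B_orth = np.linalg.qr(B)[0]
    return np.linalg.norm((np.eye(d) - B_star @ B_star.T) @ B_orth, 2)

def task_batch(t):
    """Fresh noiseless linear-regression samples for task t."""
    X = rng.standard_normal((m, d))
    return X, X @ (B_star @ V_star[:, t])

B = np.linalg.qr(rng.standard_normal((d, k)))[0]   # learned representation
dist_init = subspace_dist(B)

for _ in range(steps):
    grad_B = np.zeros_like(B)
    for t in range(n_tasks):
        # Inner loop (ANIL): one gradient step on the head only, from w = 0.
        Xi, yi = task_batch(t)
        w = inner_lr * B.T @ Xi.T @ yi / m
        # Outer loss on fresh samples; first-order gradient w.r.t. B only,
        # treating the adapted head w as a constant (FO-ANIL approximation).
        Xo, yo = task_batch(t)
        grad_B += np.outer(Xo.T @ (Xo @ B @ w - yo), w) / m
    B = B - outer_lr * grad_B / n_tasks

dist_final = subspace_dist(B)   # shrinks as col(B) aligns with col(B_star)
```

Note how the adapted head `w` multiplies the outer gradient: directions of `B` orthogonal to `col(B_star)` are damped through the task-averaged outer product of the heads, which is exactly where task diversity enters the abstract's argument.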
