Paper Title

Latent Bottlenecked Attentive Neural Processes

Paper Authors

Leo Feng, Hossein Hajimirsadeghi, Yoshua Bengio, Mohamed Osama Ahmed

Paper Abstract

Neural Processes (NPs) are popular methods in meta-learning that can estimate predictive uncertainty on target datapoints by conditioning on a context dataset. The previous state-of-the-art method, Transformer Neural Processes (TNPs), achieves strong performance but requires quadratic computation with respect to the number of context datapoints, significantly limiting its scalability. Conversely, existing sub-quadratic NP variants perform significantly worse than TNPs. Tackling this issue, we propose Latent Bottlenecked Attentive Neural Processes (LBANPs), a new computationally efficient sub-quadratic NP variant whose querying computational complexity is independent of the number of context datapoints. The model encodes the context dataset into a constant number of latent vectors on which self-attention is performed. When making predictions, the model retrieves higher-order information from the context dataset via multiple cross-attention mechanisms on the latent vectors. We empirically show that LBANPs achieve results competitive with the state-of-the-art on meta-regression, image completion, and contextual multi-armed bandits. We demonstrate that LBANPs can trade off computational cost and performance according to the number of latent vectors. Finally, we show that LBANPs can scale beyond existing attention-based NP variants to larger dataset settings.
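To make the architecture described in the abstract concrete, below is a minimal PyTorch sketch of the latent-bottleneck idea, not the authors' implementation: all module names (`LatentBottleneckBlock`, `LBANPSketch`), layer counts, dimensions, and the choice of output head are illustrative assumptions. A fixed set of latent vectors cross-attends to the embedded context set and then self-attends; at prediction time, target points cross-attend only to the latents, so the per-query cost does not grow with the context size.

```python
# Hypothetical sketch of a latent-bottlenecked attentive model (assumptions, not the paper's code).
import torch
import torch.nn as nn


class LatentBottleneckBlock(nn.Module):
    """Cross-attend the latents to the context, then self-attend over the latents."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.self_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, latents, context):
        # latents: (B, L, D) with L constant; context: (B, N, D) with N context points.
        latents = latents + self.cross_attn(latents, context, context)[0]  # O(L * N)
        latents = latents + self.self_attn(latents, latents, latents)[0]   # O(L ** 2)
        return latents


class LBANPSketch(nn.Module):
    def __init__(self, x_dim, y_dim, dim=64, num_latents=8, depth=2, num_heads=4):
        super().__init__()
        self.context_embed = nn.Linear(x_dim + y_dim, dim)
        self.query_embed = nn.Linear(x_dim, dim)
        self.latents = nn.Parameter(torch.randn(num_latents, dim))
        self.encoder = nn.ModuleList(
            [LatentBottleneckBlock(dim, num_heads) for _ in range(depth)]
        )
        self.decoder = nn.ModuleList(
            [nn.MultiheadAttention(dim, num_heads, batch_first=True) for _ in range(depth)]
        )
        self.head = nn.Linear(dim, 2 * y_dim)  # e.g. predictive mean and pre-softplus scale

    def forward(self, x_ctx, y_ctx, x_tgt):
        ctx = self.context_embed(torch.cat([x_ctx, y_ctx], dim=-1))  # (B, N, D)
        lat = self.latents.expand(ctx.size(0), -1, -1)               # (B, L, D)
        qry = self.query_embed(x_tgt)                                # (B, M, D)

        # Encoding: compress the whole context set into L latent vectors.
        layer_latents = []
        for block in self.encoder:
            lat = block(lat, ctx)
            layer_latents.append(lat)

        # Querying: targets cross-attend only to the latents, so the cost per
        # prediction is independent of the number of context points N.
        for cross_attn, lat_l in zip(self.decoder, layer_latents):
            qry = qry + cross_attn(qry, lat_l, lat_l)[0]
        return self.head(qry)  # (B, M, 2 * y_dim)


# Example: 32 context points, 16 target points, 1-D inputs and outputs.
model = LBANPSketch(x_dim=1, y_dim=1)
out = model(torch.randn(4, 32, 1), torch.randn(4, 32, 1), torch.randn(4, 16, 1))
print(out.shape)  # torch.Size([4, 16, 2])
```

In this sketch, `num_latents` is the knob the abstract refers to: more latents widen the bottleneck (higher cost, potentially better predictions), while fewer latents make encoding and querying cheaper regardless of how large the context set grows.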
