Paper Title
Reason from Context with Self-supervised Learning
Paper Authors
Paper Abstract
Self-supervised learning (SSL) learns to capture discriminative visual features useful for knowledge transfer. To better accommodate the object-centric nature of current downstream tasks such as object recognition and detection, various methods have been proposed to suppress contextual biases or to disentangle objects from contexts. Nevertheless, these methods may prove inadequate in situations where object identity must be reasoned from the associated context, such as recognizing or inferring tiny or obscured objects. As an initial effort in the SSL literature, we investigate whether and how contextual associations can be enhanced for visual reasoning within SSL regimes by (a) proposing a new Self-supervised method with external memories for Context Reasoning (SeCo), and (b) introducing two new downstream tasks, lift-the-flap and object priming, which address the problems of "what" and "where" in context reasoning. In both tasks, SeCo outperformed all state-of-the-art (SOTA) SSL methods by a significant margin. Our network analysis revealed that the external memory proposed in SeCo learns to store prior contextual knowledge, facilitating target identity inference in the lift-the-flap task. Moreover, we conducted psychophysics experiments and introduced a Human benchmark in Object Priming (HOP) dataset. Our results demonstrate that SeCo exhibits human-like behaviors.
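The abstract describes SeCo only at a high level, so the sketch below is a minimal, hypothetical illustration of the core idea it names: inferring the identity of a hidden ("lift-the-flap") region from surrounding context via an attention read over a learnable external memory. All module names, dimensions, and the attention-read formulation here are assumptions made for illustration, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class ContextMemoryReader(nn.Module):
    """Toy illustration (not SeCo itself): infer the class of a masked
    region from its visible context by attending over an external memory
    of learnable contextual prototypes."""

    def __init__(self, ctx_dim=128, n_slots=64, n_classes=10):
        super().__init__()
        # External memory: learnable slots meant to store prior contextual knowledge.
        self.memory = nn.Parameter(torch.randn(n_slots, ctx_dim) * 0.02)
        # Small context encoder (hypothetical; any backbone would do).
        self.ctx_encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, ctx_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.classifier = nn.Linear(ctx_dim, n_classes)

    def forward(self, image, flap_mask):
        # Zero out the target region so only the context is visible.
        ctx = self.ctx_encoder(image * (1.0 - flap_mask))          # (B, D)
        # Soft attention read: query the memory with the context code.
        attn = torch.softmax(
            ctx @ self.memory.t() / ctx.shape[-1] ** 0.5, dim=-1)  # (B, slots)
        read = attn @ self.memory                                  # (B, D)
        # Predict "what" is behind the flap from context plus retrieved knowledge.
        return self.classifier(ctx + read)

model = ContextMemoryReader()
image = torch.randn(2, 3, 64, 64)
mask = torch.zeros(2, 1, 64, 64)
mask[:, :, 16:32, 16:32] = 1.0    # hide a 16x16 patch: "the flap"
logits = model(image, mask)       # (2, 10) class scores for the hidden object
```

In this toy setup, training the classifier on context-only inputs forces the memory slots to encode scene-to-object regularities, which is one plausible reading of how an external memory could support the "what" inference the abstract attributes to the lift-the-flap task.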