Paper Title
The Costs and Benefits of Goal-Directed Attention in Deep Convolutional Neural Networks
Paper Authors
Paper Abstract
People deploy top-down, goal-directed attention to accomplish tasks, such as finding lost keys. By tuning the visual system to relevant information sources, object recognition can become more efficient (a benefit) and more biased toward the target (a potential cost). Motivated by selective attention in categorisation models, we developed a goal-directed attention mechanism that can process naturalistic (photographic) stimuli. Our attention mechanism can be incorporated into any existing deep convolutional neural network (DCNN). The processing stages in DCNNs have been related to the ventral visual stream. In that light, our attentional mechanism incorporates top-down influences from prefrontal cortex (PFC) to support goal-directed behaviour. Akin to how attention weights in categorisation models warp representational spaces, we introduce a layer of attention weights at the mid-level of a DCNN that amplifies or attenuates activity to further a goal. We evaluated the attentional mechanism using photographic stimuli, varying the attentional target. We found that increasing goal-directed attention has benefits (increasing hit rates) and costs (increasing false alarm rates). At a moderate level, attention improves sensitivity (i.e., increases $d^\prime$) with only a moderate increase in bias for tasks involving standard images, blended images, and natural adversarial images chosen to fool DCNNs. These results suggest that goal-directed attention can reconfigure general-purpose DCNNs to better suit the current task goal, much like PFC modulates activity along the ventral stream. In addition to being more parsimonious and brain-consistent, the mid-level attention approach performed better than a standard machine learning approach for transfer learning, namely retraining the final network layer to accommodate the new task.
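To make the mechanism concrete, below is a minimal PyTorch sketch of the idea described in the abstract: a layer of multiplicative, channel-wise attention weights inserted at a mid-level layer of a pretrained DCNN, together with the standard signal-detection formula for sensitivity, $d^\prime = Z(\text{hit rate}) - Z(\text{false-alarm rate})$. The choice of VGG-16, the insertion index, the `strength` parameter, and the `d_prime` helper are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn
from torchvision import models
from scipy.stats import norm

class ChannelAttention(nn.Module):
    """Multiplicative attention: one scalar weight per feature-map channel.

    Weights above 1 amplify a channel's activity, weights below 1 attenuate
    it; `strength` scales how far the weights deviate from neutral (1.0).
    (Hypothetical module illustrating the paper's mid-level attention idea.)
    """
    def __init__(self, n_channels, strength=1.0):
        super().__init__()
        # Initialised to 1.0 (no modulation); a task goal would set these.
        self.weights = nn.Parameter(torch.ones(n_channels))
        self.strength = strength

    def forward(self, x):
        w = 1.0 + self.strength * (self.weights - 1.0)
        return x * w.view(1, -1, 1, 1)  # broadcast over batch and space

# Insert the attention layer mid-way through a pretrained DCNN.
# VGG-16 and the split index are assumptions; the abstract says the
# mechanism can be incorporated into any existing DCNN.
vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT).eval()
mid = 16  # after the third conv block, where feature maps have 256 channels
attn = ChannelAttention(n_channels=256, strength=0.5)
model = nn.Sequential(
    *list(vgg.features[:mid]), attn, *list(vgg.features[mid:]),
    vgg.avgpool, nn.Flatten(), *list(vgg.classifier),
)

logits = model(torch.randn(1, 3, 224, 224))  # ImageNet-style input -> 1000 logits

def d_prime(hit_rate, fa_rate):
    """Signal-detection sensitivity: d' = Z(hits) - Z(false alarms)."""
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

print(d_prime(0.85, 0.20))  # ~1.88: higher hits at moderate false-alarm cost
```

Multiplying each feature map by a single scalar keeps the mechanism parsimonious: it adds only as many free parameters as the chosen layer has channels, in contrast to the transfer-learning baseline of retraining the entire final network layer.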