Paper Title

Disentangled Non-Local Neural Networks

Paper Authors

Minghao Yin, Zhuliang Yao, Yue Cao, Xiu Li, Zheng Zhang, Stephen Lin, Han Hu

Paper Abstract

The non-local block is a popular module for strengthening the context modeling ability of a regular convolutional neural network. This paper first studies the non-local block in depth, where we find that its attention computation can be split into two terms: a whitened pairwise term accounting for the relationship between two pixels, and a unary term representing the saliency of every pixel. We also observe that the two terms, when trained alone, tend to model different visual clues; for example, the whitened pairwise term learns within-region relationships while the unary term learns salient boundaries. However, the two terms are tightly coupled in the non-local block, which hinders the learning of each. Based on these findings, we present the disentangled non-local block, where the two terms are decoupled to facilitate learning for both. We demonstrate the effectiveness of the decoupled design on various tasks, such as semantic segmentation on Cityscapes, ADE20K and PASCAL Context, object detection on COCO, and action recognition on Kinetics.
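The split described in the abstract can be checked numerically. The sketch below is only an illustration and is not taken from the paper's code: it assumes the common dot-product softmax form of non-local attention, and the names `standard_attention` and `split_attention` are hypothetical. Under that assumption, the logit q_i · k_j equals a whitened pairwise part (q_i - mu_q) · (k_j - mu_k) plus a unary part mu_q · k_j, plus terms that do not depend on j and therefore cancel inside the softmax over j.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def standard_attention(q, k):
    # Assumed form: softmax over key positions j of the dot product q_i . k_j.
    return softmax(q @ k.T, axis=-1)


def split_attention(q, k):
    # Means of the query and key embeddings over all N positions.
    mu_q = q.mean(axis=0, keepdims=True)  # shape (1, C)
    mu_k = k.mean(axis=0, keepdims=True)  # shape (1, C)
    # Whitened pairwise term: depends on both the query i and the key j.
    pairwise = (q - mu_q) @ (k - mu_k).T  # shape (N, N)
    # Unary term: depends only on the key j, so it is shared by every query.
    unary = mu_q @ k.T                    # shape (1, N), broadcast over rows
    return softmax(pairwise + unary, axis=-1), pairwise, unary


# Illustrative random embeddings: N = 6 positions, C = 4 channels.
rng = np.random.default_rng(0)
q = rng.normal(size=(6, 4))
k = rng.normal(size=(6, 4))

coupled = standard_attention(q, k)
recombined, pairwise, unary = split_attention(q, k)

# The two computations give the same attention map.
assert np.allclose(coupled, recombined)
```

Because the unary logits are added to every row before a single softmax, the two terms share one normalization in this form. That shared normalization is one way to read the coupling that the abstract says the disentangled block removes.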
