Salient Object Detection via Dynamic Scale Routing

Paper Title

Salient Object Detection via Dynamic Scale Routing

Authors

Zhenyu Wu, Shuai Li, Chenglizhao Chen, Hong Qin, Aimin Hao

Abstract

Recent research advances in salient object detection (SOD) could largely be attributed to ever-stronger multi-scale feature representation empowered by the deep learning technologies. The existing SOD deep models extract multi-scale features via the off-the-shelf encoders and combine them smartly via various delicate decoders. However, the kernel sizes in this commonly-used thread are usually "fixed". In our new experiments, we have observed that kernels of small size are preferable in scenarios containing tiny salient objects. In contrast, large kernel sizes could perform better for images with large salient objects. Inspired by this observation, we advocate the "dynamic" scale routing (as a brand-new idea) in this paper. It will result in a generic plug-in that could directly fit the existing feature backbone. This paper's key technical innovations are two-fold. First, instead of using the vanilla convolution with fixed kernel sizes for the encoder design, we propose the dynamic pyramid convolution (DPConv), which dynamically selects the best-suited kernel sizes w.r.t. the given input. Second, we provide a self-adaptive bidirectional decoder design to accommodate the DPConv-based encoder best. The most significant highlight is its capability of routing between feature scales and their dynamic collection, making the inference process scale-aware. As a result, this paper continues to enhance the current SOTA performance. Both the code and dataset are publicly available at https://github.com/wuzhenyubuaa/DPNet.
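The abstract's central idea, choosing kernel sizes according to the input, can be illustrated with a small sketch. This is not the authors' DPConv (see their repository for that): it is a toy 1-D example in plain Python, and every name in it (`box_filter`, `dynamic_scale_mix`, `gate_weights`, `gate_bias`) is hypothetical. It assumes a soft selection, where a global statistic of the input drives a softmax over parallel branches with different kernel sizes. The gate parameters are hand-picked here, whereas a real model would learn them.

```python
import math


def box_filter(signal, k):
    """Mean filter with odd kernel size k, replicating values at the edges."""
    r = k // 2
    n = len(signal)
    out = []
    for i in range(n):
        window = [signal[min(max(j, 0), n - 1)] for j in range(i - r, i + r + 1)]
        out.append(sum(window) / k)
    return out


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_scale_mix(signal,
                      kernel_sizes=(1, 3, 5),
                      gate_weights=(-4.0, 0.0, 4.0),  # assumption: hand-picked, not learned
                      gate_bias=(2.0, 0.0, -2.0)):    # assumption: hand-picked, not learned
    """Blend branches of different kernel sizes with input-dependent weights."""
    # Global average pooling summarizes how much of the input is "salient".
    pooled = sum(signal) / len(signal)
    # One gate score per kernel size, normalized into routing weights.
    weights = softmax([w * pooled + b for w, b in zip(gate_weights, gate_bias)])
    # Run every branch, then mix the outputs position by position.
    branches = [box_filter(signal, k) for k in kernel_sizes]
    mixed = [sum(w * b[i] for w, b in zip(weights, branches))
             for i in range(len(signal))]
    return mixed, weights


tiny_object = [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
large_object = [0, 1, 1, 1, 1, 1, 1, 1, 1, 0]
_, tiny_weights = dynamic_scale_mix(tiny_object)
_, large_weights = dynamic_scale_mix(large_object)
print("tiny object weights :", [round(w, 3) for w in tiny_weights])
print("large object weights:", [round(w, 3) for w in large_weights])
```

With these hand-picked gate parameters, the input containing a tiny object puts most of its weight on the smallest kernel, and the input containing a large object puts most of its weight on the largest one, which mirrors the observation reported in the abstract.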

Paper Title