FCSN: Global Context Aware Segmentation by Learning the Fourier Coefficients of Objects in Medical Images

Paper title

FCSN: Global Context Aware Segmentation by Learning the Fourier Coefficients of Objects in Medical Images

Authors

Young Seok Jeon, Hongfei Yang, Mengling Feng

Abstract

The encoder-decoder model is a commonly used Deep Neural Network (DNN) model for medical image segmentation. Conventional encoder-decoder models make pixel-wise predictions that focus heavily on local patterns around each pixel. This makes it challenging to produce a segmentation that preserves the object's shape and topology, which often requires an understanding of the global context of the object. In this work, we propose the Fourier Coefficient Segmentation Network (FCSN), a novel DNN-based model that segments an object by learning the complex Fourier coefficients of the object's masks. The Fourier coefficients are calculated by integrating over the whole contour. Therefore, to estimate the coefficients precisely, the model is motivated to incorporate the global context of the object, leading to a more accurate segmentation of the object's shape. This global context awareness also makes our model robust to unseen local perturbations during inference, such as additive noise or motion blur, which are prevalent in medical images. When FCSN is compared with other state-of-the-art models (UNet+, DeepLabV3+, UNETR) on 3 medical image segmentation tasks (ISIC_2018, RIM_CUP, RIM_DISC), FCSN attains significantly lower Hausdorff scores of 19.14 (6%), 17.42 (6%), and 9.16 (14%) on the 3 tasks, respectively. Moreover, FCSN is lightweight because it discards the decoder module, which otherwise incurs significant computational overhead. FCSN requires only 22.2M parameters, 82M and 10M fewer than UNETR and DeepLabV3+, respectively. FCSN attains inference and training speeds of 1.6 ms/img and 6.3 ms/img, which are 8× and 3× faster than UNet and UNETR.
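
To illustrate why the Fourier coefficients of a contour carry global shape information, here is a minimal sketch using the standard Fourier-descriptor construction: each contour point (x, y) is treated as a complex number z = x + iy, and the coefficients come from a discrete Fourier transform over all contour points. This is an assumption-laden illustration, not the FCSN implementation; the function names `contour_to_fourier` and `fourier_to_contour` are hypothetical, and the exact formulation in the paper may differ.

```python
import numpy as np


def contour_to_fourier(points, n_coeffs):
    """Complex Fourier coefficients of a closed contour (hypothetical helper).

    points: (N, 2) array of (x, y) samples ordered along the contour.
    Returns the integer frequencies k with |k| <= n_coeffs and their c_k.
    """
    z = points[:, 0] + 1j * points[:, 1]
    n = len(z)
    # Each c_k is an average over every contour point, so it depends on
    # the whole shape rather than on a local neighbourhood.
    coeffs = np.fft.fft(z) / n
    freqs = np.rint(np.fft.fftfreq(n) * n).astype(int)
    keep = np.abs(freqs) <= n_coeffs
    return freqs[keep], coeffs[keep]


def fourier_to_contour(freqs, coeffs, n_points):
    """Rebuild (n_points, 2) contour samples from the kept coefficients."""
    t = np.arange(n_points) / n_points
    phases = np.exp(2j * np.pi * freqs[None, :] * t[:, None])
    z = (coeffs[None, :] * phases).sum(axis=1)
    return np.stack([z.real, z.imag], axis=1)


# An ellipse centred at (5, 2) is described exactly by k = -1, 0, 1.
t = 2 * np.pi * np.arange(64) / 64
ellipse = np.stack([5 + 3 * np.cos(t), 2 + 1 * np.sin(t)], axis=1)
freqs, coeffs = contour_to_fourier(ellipse, n_coeffs=1)
reconstructed = fourier_to_contour(freqs, coeffs, n_points=64)
```

In this sketch a handful of low-frequency coefficients already encode the position (k = 0) and the overall outline of the object; a network that regresses such coefficients has to account for the entire contour at once, which is the intuition the abstract describes.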
