Paper title
CapsNet for Medical Image Segmentation
Paper authors
Paper abstract
Convolutional Neural Networks (CNNs) have been successful in solving computer vision tasks, including medical image segmentation, because they can automatically extract features from unstructured data. However, CNNs are sensitive to rotation and affine transformations, and their success relies on large-scale labeled datasets that capture a wide range of input variations. This paradigm poses challenges at scale because annotated data for medical segmentation is expensive to acquire and is subject to strict privacy regulations. Furthermore, visual representation learning with CNNs has its own flaws: arguably, the pooling layers in traditional CNNs tend to discard positional information, and CNNs tend to fail on input images that differ in orientation and size. The capsule network (CapsNet) is a recent architecture that achieves better robustness in representation learning by replacing pooling layers with dynamic routing and convolutional strides, and it has shown promising results on popular tasks such as classification, recognition, segmentation, and natural language processing. Unlike CNNs, which produce scalar outputs, CapsNet returns vector outputs, which aim to preserve part-whole relationships. In this work, we first introduce the limitations of CNNs and the fundamentals of CapsNet. We then review recent developments of CapsNet for medical image segmentation. Finally, we discuss various effective network architectures for implementing a CapsNet for both 2D image and 3D volumetric medical image segmentation.
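The abstract's contrast between scalar CNN outputs and vector capsule outputs, and its mention of dynamic routing, can be illustrated with a small sketch. The code below is not the architecture discussed in this paper; it is a minimal NumPy illustration of the widely used routing-by-agreement formulation, under the assumption that the predictions from lower-level capsules (`u_hat`) are already computed. The function names, array shapes, and the choice of three routing iterations are illustrative assumptions.

```python
import numpy as np


def squash(s, axis=-1, eps=1e-9):
    """Shrink each vector's length into [0, 1) while keeping its direction."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)


def dynamic_routing(u_hat, num_iters=3):
    """Route predictions from lower-level to higher-level capsules.

    u_hat: array of shape (num_in, num_out, dim_out) holding the prediction
           each lower-level capsule makes for each higher-level capsule
           (assumed to be precomputed for this sketch).
    Returns the higher-level capsule vectors, shape (num_out, dim_out).
    """
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))  # routing logits; zeros give uniform coupling
    for _ in range(num_iters):
        # Coupling coefficients: softmax over the higher-level capsules.
        e = np.exp(b - b.max(axis=1, keepdims=True))
        c = e / e.sum(axis=1, keepdims=True)
        # Weighted sum of predictions, then squash to obtain vector outputs.
        s = np.einsum("ij,ijd->jd", c, u_hat)
        v = squash(s)
        # Raise the logits where a prediction agrees with the current output.
        b = b + np.einsum("ijd,jd->ij", u_hat, v)
    return v


# Illustrative sizes: 6 input capsules, 2 output capsules, 4-D output vectors.
rng = np.random.default_rng(0)
u_hat = rng.normal(size=(6, 2, 4))
v = dynamic_routing(u_hat)
lengths = np.linalg.norm(v, axis=-1)  # each length lies in [0, 1)
```

In this formulation the length of each output vector can be read as the probability that the entity it represents is present, while its direction encodes the entity's properties. This is one way a vector output can carry more than the single activation a pooled CNN feature provides.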