Paper Title
FreDSNet: Joint Monocular Depth and Semantic Segmentation with Fast Fourier Convolutions
Paper Authors
Paper Abstract
In this work we present FreDSNet, a deep learning solution that obtains semantic 3D understanding of indoor environments from single panoramas. Omnidirectional images offer task-specific advantages for scene understanding problems thanks to the 360-degree contextual information they provide about the entire environment. However, the inherent characteristics of omnidirectional images pose additional challenges for accurate object detection and segmentation and for good depth estimation. To overcome these problems, we exploit convolutions in the frequency domain, obtaining a wider receptive field in each convolutional layer. These convolutions make it possible to leverage the whole contextual information from omnidirectional images. FreDSNet is the first network that jointly provides monocular depth estimation and semantic segmentation from a single panoramic image by exploiting fast Fourier convolutions. Our experiments show that FreDSNet performs comparably to state-of-the-art methods specific to semantic segmentation and to depth estimation. The FreDSNet code is publicly available at https://github.com/Sbrunoberenguel/FreDSNet
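The abstract's key idea is that a convolution applied in the frequency domain mixes all spatial positions at once, so a single layer sees the whole panorama. The following is a minimal NumPy sketch of that principle only, not the authors' FFC layer: the function name `spectral_conv` and the random per-frequency weights are illustrative assumptions.

```python
import numpy as np

def spectral_conv(x, w_real, w_imag):
    """Toy frequency-domain convolution: multiply the image spectrum by
    learnable per-frequency weights, then return to the spatial domain.
    Every output pixel depends on every input pixel, i.e. the layer has
    a global receptive field. (Illustrative sketch, not the paper's FFC.)"""
    spectrum = np.fft.rfft2(x)                    # (H, W//2 + 1) complex spectrum
    spectrum = spectrum * (w_real + 1j * w_imag)  # element-wise spectral weighting
    return np.fft.irfft2(spectrum, s=x.shape)     # back to a real (H, W) image

# Small example with random "panorama" and random weights.
H, W = 8, 16
rng = np.random.default_rng(0)
x = rng.standard_normal((H, W))
w_r = rng.standard_normal((H, W // 2 + 1))
w_i = rng.standard_normal((H, W // 2 + 1))
y = spectral_conv(x, w_r, w_i)
print(y.shape)  # same spatial size as the input: (8, 16)
```

A real FFC block, as used in FreDSNet, additionally splits channels into local (spatial convolution) and global (spectral) paths and learns the spectral weights by backpropagation; the sketch keeps only the spectral path to show where the wide receptive field comes from.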