Paper Title

1st Place Solution of The Robust Vision Challenge 2022 Semantic Segmentation Track

Authors

Junfei Xiao, Zhichao Xu, Shiyi Lan, Zhiding Yu, Alan Yuille, Anima Anandkumar

Abstract

This report describes the winning solution to the Robust Vision Challenge (RVC) semantic segmentation track at ECCV 2022. Our method adopts the FAN-B-Hybrid model as the encoder and uses SegFormer as the segmentation framework. The model is trained on a composite dataset consisting of images from nine datasets (ADE20K, Cityscapes, Mapillary Vistas, ScanNet, VIPER, WildDash 2, IDD, BDD, and COCO) with a simple dataset balancing strategy. All the original labels are projected into a unified 256-class label space, and the model is trained with a cross-entropy loss. Without significant hyperparameter tuning or any specific loss weighting, our solution ranks first on all the test semantic segmentation benchmarks from multiple domains (ADE20K, Cityscapes, Mapillary Vistas, ScanNet, VIPER, and WildDash 2). The proposed method can serve as a strong baseline for multi-domain segmentation and benefit future work. Code will be available at https://github.com/lambert-x/RVC_Segmentation.
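The abstract names two implementation details, unified label-space projection and dataset balancing, that a small sketch can make concrete. Below is a minimal Python sketch, not the authors' released code: the per-dataset label-map entries, the reserved ignore index of 255, and the square-root (power-law) dataset-weighting rule are all illustrative assumptions, since the paper only states that a "simple dataset balancing strategy" is used and that labels are projected into a 256-class space.

```python
# Minimal sketch (assumed, not the released RVC code) of two ideas from the
# abstract: remapping per-dataset labels into a unified 256-class space, and
# sampling source datasets with a balancing weight.
import random

# Assumption: one ID in the 256-class space is reserved as "ignore" for the
# cross-entropy loss.
UNIFIED_IGNORE = 255

# Hypothetical per-dataset tables from original label IDs to unified IDs;
# the real tables come from the RVC label-space definition.
LABEL_MAPS = {
    "cityscapes": {7: 0, 8: 1, 11: 2},  # illustrative entries only
    "ade20k": {4: 2, 7: 0},             # illustrative entries only
}

def to_unified(dataset: str, mask_row: list[int]) -> list[int]:
    """Remap one row of a segmentation mask into the unified label space;
    labels without a mapping fall back to the ignore index."""
    mapping = LABEL_MAPS[dataset]
    return [mapping.get(label, UNIFIED_IGNORE) for label in mask_row]

def sample_dataset(sizes: dict[str, int], power: float = 0.5) -> str:
    """Pick a source dataset with probability proportional to size**power,
    flattening the raw size imbalance (power=1.0 means no balancing)."""
    names = list(sizes)
    weights = [sizes[name] ** power for name in names]
    return random.choices(names, weights=weights, k=1)[0]

if __name__ == "__main__":
    print(to_unified("cityscapes", [7, 8, 99]))   # -> [0, 1, 255]
    counts = {"cityscapes": 3_000, "coco": 120_000}
    print(sample_dataset(counts))  # large datasets dominate less than raw sizes imply
```

Power-law (temperature) sampling is a common choice for multi-dataset training, but the exact balancing rule used in the paper is not specified, so the one shown here is only an assumption.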
