Paper Title
Multi-Granularity Canonical Appearance Pooling for Remote Sensing Scene Classification
Paper Authors
Paper Abstract
Recognising remote sensing scene images remains challenging due to large visual-semantic discrepancies. These mainly arise from the lack of detailed annotations that could be employed to align pixel-level representations with high-level semantic labels. As the tagging process is labour-intensive and subjective, we hereby propose a novel Multi-Granularity Canonical Appearance Pooling (MG-CAP) to automatically capture the latent ontological structure of remote sensing datasets. We design a granular framework that progressively crops the input image to learn multi-grained features. For each specific granularity, we discover the canonical appearance from a set of pre-defined transformations and learn the corresponding CNN features through a maxout-based Siamese-style architecture. Then, we replace the standard CNN features with Gaussian covariance matrices and adopt proper matrix normalisations to improve the discriminative power of the features. In addition, we provide a stable solution for training the eigenvalue decomposition function (EIG) on a GPU and derive the corresponding back-propagation using matrix calculus. Extensive experiments show that our framework achieves promising results on public remote sensing scene datasets.
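The following is a minimal sketch, assuming a PyTorch setting, of two ingredients the abstract describes: a shared-weight (Siamese-style) CNN applied to a set of pre-defined transformations with an element-wise maxout selecting the canonical appearance, and second-order (covariance) pooling whose eigenvalue decomposition is stabilised for GPU training. The backbone module, the matrix square-root normalisation, and all names (`covariance_pooling`, `MaxoutCovariancePooling`, `eps`) are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch (not the authors' code) of maxout over transformed views plus
# covariance pooling with a stabilised eigenvalue decomposition, in PyTorch.
import torch
import torch.nn as nn


def covariance_pooling(features: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Replace first-order CNN features with a normalised covariance matrix.

    features: (B, C, H, W) convolutional feature maps.
    Returns:  (B, C, C) covariance matrices after square-root normalisation.
    """
    b, c, h, w = features.shape
    x = features.reshape(b, c, h * w)                  # spatial positions act as samples
    x = x - x.mean(dim=2, keepdim=True)                # centre each channel
    cov = x @ x.transpose(1, 2) / (h * w - 1)          # (B, C, C) sample covariance

    # Eigen-decomposition based normalisation; the small eps keeps eigenvalues
    # away from zero so the EIG gradient stays finite during GPU training.
    eye = torch.eye(c, device=cov.device, dtype=cov.dtype)
    eigvals, eigvecs = torch.linalg.eigh(cov + eps * eye)
    eigvals = eigvals.clamp_min(eps).sqrt()            # matrix square-root (one possible choice)
    return eigvecs @ torch.diag_embed(eigvals) @ eigvecs.transpose(1, 2)


class MaxoutCovariancePooling(nn.Module):
    """Shared-weight branches over pre-defined transformations, fused by an
    element-wise maxout, followed by covariance pooling and a linear classifier."""

    def __init__(self, backbone: nn.Module, feat_dim: int, num_classes: int):
        super().__init__()
        self.backbone = backbone                        # shared CNN trunk (Siamese-style)
        self.classifier = nn.Linear(feat_dim * feat_dim, num_classes)

    def forward(self, views: torch.Tensor) -> torch.Tensor:
        # views: (B, T, 3, H, W) - T transformed copies (e.g. rotated crops) of one image.
        b, t = views.shape[:2]
        feats = self.backbone(views.flatten(0, 1))      # (B*T, C, h, w), identical weights per view
        feats = feats.unflatten(0, (b, t))              # (B, T, C, h, w)
        canonical = feats.max(dim=1).values             # maxout selects the canonical appearance
        cov = covariance_pooling(canonical)             # (B, C, C)
        return self.classifier(cov.flatten(1))
```

In practice one would likely reduce the channel dimension (e.g. with a 1x1 convolution) before the covariance step so that the C x C matrix and the flattened classifier input stay manageable; that choice, like the backbone itself, is an assumption of this sketch rather than part of the published method.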