Holistic Multi-View Building Analysis in the Wild with Projection Pooling

Paper title

Holistic Multi-View Building Analysis in the Wild with Projection Pooling

Paper authors

Zbigniew Wojna, Krzysztof Maziarz, Łukasz Jocz, Robert Pałuba, Robert Kozikowski, Iasonas Kokkinos

Abstract

We address six different classification tasks related to fine-grained building attributes: construction type, number of floors, pitch and geometry of the roof, facade material, and occupancy class. Tackling such a remote building analysis problem became possible only recently due to growing large-scale datasets of urban scenes. To this end, we introduce a new benchmarking dataset, consisting of 49426 images (top-view and street-view) of 9674 buildings. These photos are further assembled, together with the geometric metadata. The dataset showcases various real-world challenges, such as occlusions, blur, partially visible objects, and a broad spectrum of buildings. We propose a new projection pooling layer, creating a unified, top-view representation of the top-view and the side views in a high-dimensional space. It allows us to utilize the building and imagery metadata seamlessly. Introducing this layer improves classification accuracy -- compared to highly tuned baseline models -- indicating its suitability for building analysis.

Paper title