Paper title
Holistic Multi-View Building Analysis in the Wild with Projection Pooling
Authors
Abstract
We address six different classification tasks related to fine-grained building attributes: construction type, number of floors, pitch and geometry of the roof, facade material, and occupancy class. Tackling such a remote building analysis problem became possible only recently due to growing large-scale datasets of urban scenes. To this end, we introduce a new benchmarking dataset, consisting of 49426 images (top-view and street-view) of 9674 buildings. These photos are further assembled, together with the geometric metadata. The dataset showcases various real-world challenges, such as occlusions, blur, partially visible objects, and a broad spectrum of buildings. We propose a new projection pooling layer, creating a unified, top-view representation of the top-view and the side views in a high-dimensional space. It allows us to utilize the building and imagery metadata seamlessly. Introducing this layer improves classification accuracy -- compared to highly tuned baseline models -- indicating its suitability for building analysis.