Paper Title
Multi-Plane Program Induction with 3D Box Priors
Paper Authors
Paper Abstract
We consider two important aspects of understanding and editing images: modeling regular, program-like textures or patterns in 2D planes, and the 3D posing of these planes in the scene. Unlike prior work on image-based program synthesis, which assumes the image contains a single visible 2D plane, we present Box Program Induction (BPI), which infers a program-like scene representation that simultaneously models repeated structure on multiple 2D planes, the 3D position and orientation of the planes, and camera parameters, all from a single image. Our model assumes a box prior, i.e., that the image captures either an inner view or an outer view of a box in 3D. It uses neural networks to infer visual cues such as vanishing points and wireframe lines, which guide a search-based algorithm to find the program that best explains the image. Such a holistic, structured scene representation enables 3D-aware interactive image editing operations such as inpainting missing pixels, changing camera parameters, and extrapolating the image contents.
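As background on one of the visual cues named in the abstract, the sketch below shows the standard geometric definition of a vanishing point: the image-space intersection of two lines that are parallel in 3D, computed with homogeneous coordinates. This is not the paper's method (the abstract states that BPI infers these cues with neural networks). The function names `cross`, `line_through`, and `vanishing_point` are hypothetical and chosen only for illustration.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def line_through(p, q):
    """Homogeneous line through two image points (x, y)."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def vanishing_point(seg_a, seg_b, eps=1e-9):
    """Intersect the lines supporting two image segments.

    Each segment is a pair of (x, y) points. Returns (x, y), or None
    when the two lines are parallel in the image, in which case the
    vanishing point lies at infinity.
    """
    x, y, w = cross(line_through(*seg_a), line_through(*seg_b))
    if abs(w) < eps:
        return None
    return (x / w, y / w)


# Two segments that converge toward the right side of the image.
vp = vanishing_point(((0, 0), (2, 1)), ((0, 4), (2, 3)))
```

With a box prior, the edges of the box fall into three mutually orthogonal direction families, so a scene of this kind has up to three finite vanishing points of this sort.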