论文标题
主动辅助示例性 - 基于深度学习的立体声方法的跨概要功能为基准测试
Active-Passive SimStereo -- Benchmarking the Cross-Generalization Capabilities of Deep Learning-based Stereo Methods
论文作者
论文摘要
在立体声视觉中,自相似或平淡的区域可能使得很难匹配两个图像之间的补丁。基于主动立体声的方法通过在场景上投射一个伪随机模式来减轻此问题,以便可以在没有歧义的情况下识别图像对的每个贴片。但是,投影模式显着改变了图像的外观。如果这种模式是一种对抗性噪声的一种形式,则可能对基于深度学习的方法的性能产生负面影响,这现在是密集立体声视觉的事实上的标准。在本文中,我们提出了主动辅助型示例数据集和相应的基准,以评估立体声匹配算法的被动立体声和主动立体声图像之间的性能差距。使用拟议的基准测试和额外的消融研究,我们表明特征提取和匹配的模块选择了二十种选定的基于学习的基于深度学习的立体声匹配方法,可以推广到主动立体声,而没有问题。但是,由于二十个体系结构(ACVNet,cascadestereo和stereonet)中三个的差异细化模块由于对输入图像的外观的依赖而受到主动立体声模式的负面影响。
In stereo vision, self-similar or bland regions can make it difficult to match patches between two images. Active stereo-based methods mitigate this problem by projecting a pseudo-random pattern on the scene so that each patch of an image pair can be identified without ambiguity. However, the projected pattern significantly alters the appearance of the image. If this pattern acts as a form of adversarial noise, it could negatively impact the performance of deep learning-based methods, which are now the de-facto standard for dense stereo vision. In this paper, we propose the Active-Passive SimStereo dataset and a corresponding benchmark to evaluate the performance gap between passive and active stereo images for stereo matching algorithms. Using the proposed benchmark and an additional ablation study, we show that the feature extraction and matching modules of a selection of twenty selected deep learning-based stereo matching methods generalize to active stereo without a problem. However, the disparity refinement modules of three of the twenty architectures (ACVNet, CascadeStereo, and StereoNet) are negatively affected by the active stereo patterns due to their reliance on the appearance of the input images.