Paper Title
Lost in Latent Space: Disentangled Models and the Challenge of Combinatorial Generalisation
Paper Authors
Paper Abstract
Recent research has shown that generative models with highly disentangled representations fail to generalise to unseen combinations of generative factor values. These findings contradict earlier research which showed improved performance in out-of-training-distribution settings when compared to entangled representations. Additionally, it is not clear whether the reported failures are due to (a) encoders failing to map novel combinations to the proper regions of the latent space, or (b) novel combinations being mapped correctly but the decoder/downstream process being unable to render the correct output for the unseen combinations. We investigate these alternatives by testing several models on a range of datasets and training settings. We find that (i) when models fail, their encoders also fail to map unseen combinations to correct regions of the latent space, and (ii) when models succeed, it is either because the test conditions do not exclude enough examples, or because the excluded generative factors determine independent parts of the output image. Based on these results, we argue that to generalise properly, models not only need to capture factors of variation, but also understand how to invert the generative process used to generate the data.
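The encoder-level diagnostic described in the abstract can be illustrated in a toy setting: fit a simple probe from generative factors to latent codes on the seen combinations, then check whether encodings of held-out (unseen) combinations land where the probe predicts. Everything below is an illustrative assumption, not the paper's actual models or evaluation protocol: the factors, the hand-built `encoder` with a deliberate failure mode on novel combinations, and the linear probe are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy generative factors, each in [0, 1]. "Unseen" combinations:
# both factors > 0.5 are excluded from training (a combinatorial hold-out).
factors = rng.uniform(size=(2000, 2))
seen = ~((factors[:, 0] > 0.5) & (factors[:, 1] > 0.5))

def encoder(f):
    """Hypothetical trained encoder: linear on seen combinations, but
    collapsing novel combinations back toward the training region
    (the failure mode (a) described in the abstract)."""
    z = f @ np.array([[1.0, 0.2], [0.2, 1.0]])
    novel = (f[:, 0] > 0.5) & (f[:, 1] > 0.5)
    z[novel] *= 0.3  # mis-mapped latents for unseen combinations
    return z

z = encoder(factors)

# Fit a linear probe (factors -> latents) on SEEN combinations only.
W, *_ = np.linalg.lstsq(factors[seen], z[seen], rcond=None)

# Diagnostic: probe prediction error on seen vs unseen combinations.
err = np.linalg.norm(factors @ W - z, axis=1)
print(f"seen error:   {err[seen].mean():.3f}")
print(f"unseen error: {err[~seen].mean():.3f}")
```

A large gap between the two errors indicates that the encoder itself mis-maps unseen combinations (alternative (a)); if the unseen error stayed small while output images were still wrong, the failure would lie in the decoder/downstream process (alternative (b)).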