Paper Title

Simple Primitives with Feasibility- and Contextuality-Dependence for Open-World Compositional Zero-shot Learning

Authors

Zhe Liu, Yun Li, Lina Yao, Xiaojun Chang, Wei Fang, Xiaojun Wu, Yi Yang

Abstract

The task of Compositional Zero-Shot Learning (CZSL) is to recognize images of novel state-object compositions that are absent during the training stage. Previous methods that learn compositional embeddings have proven effective in closed-world CZSL. However, in Open-World CZSL (OW-CZSL), their performance tends to degrade significantly due to the large cardinality of possible compositions. Some recent works predict simple primitives (i.e., states and objects) separately to reduce cardinality. However, they treat simple primitives as independent probability distributions, ignoring the strong dependence between states, objects, and compositions. In this paper, we model the dependence of compositions via feasibility and contextuality. Feasibility-dependence refers to the unequal feasibility relations between simple primitives, e.g., "hairy" is more feasible with "cat" than with "building" in the real world. Contextuality-dependence represents the contextual variance in images, e.g., "cat" shows diverse appearances in the "dry" and "wet" states. We design Semantic Attention (SA) and generative Knowledge Disentanglement (KD) to learn the dependence of feasibility and contextuality, respectively. SA captures semantics in compositions to alleviate impossible predictions that are driven by the visual similarity between simple primitives. KD disentangles images into unbiased feature representations, easing contextual bias in predictions. Moreover, we complement the current compositional probability model with feasibility and contextuality in a compatible format. Finally, we conduct comprehensive experiments to analyze and validate the superior or competitive performance of our model, Semantic Attention and knowledge Disentanglement guided Simple Primitives (SAD-SP), on three widely used OW-CZSL benchmark datasets.
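
As a rough illustration of the argument in the abstract, the sketch below scores state-object compositions from separately predicted primitive probabilities and then reweights them by a feasibility score. It is not the SAD-SP model: the probabilities, the feasibility table, and the function names are all hypothetical, and a plain product with a lookup table stands in for the paper's Semantic Attention and Knowledge Disentanglement components.

```python
from itertools import product


def independent_scores(p_state, p_object):
    """Score every state-object pair as p(s|x) * p(o|x), assuming independence."""
    return {(s, o): p_state[s] * p_object[o]
            for s, o in product(p_state, p_object)}


def feasibility_weighted(scores, feasibility, default=1.0):
    """Reweight each pair by a feasibility score in [0, 1], then renormalise."""
    weighted = {pair: score * feasibility.get(pair, default)
                for pair, score in scores.items()}
    total = sum(weighted.values())
    return {pair: w / total for pair, w in weighted.items()}


# Hypothetical primitive classifier outputs for a single image (assumed values).
p_state = {"hairy": 0.6, "wet": 0.4}
p_object = {"cat": 0.45, "building": 0.55}

# Hypothetical feasibility table: "hairy building" is treated as implausible.
feasibility = {("hairy", "building"): 0.05}

independent = independent_scores(p_state, p_object)
dependent = feasibility_weighted(independent, feasibility)

best_independent = max(independent, key=independent.get)
best_dependent = max(dependent, key=dependent.get)
print(best_independent)
print(best_dependent)
```

With these assumed numbers, the independent product ranks the infeasible pair ("hairy", "building") first, while the feasibility-weighted scores rank ("hairy", "cat") first. Predicting primitives separately needs one output per state plus one per object, rather than one per composition, which is the cardinality reduction the abstract refers to.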
