Saliency Prediction with External Knowledge

Paper Title

Saliency Prediction with External Knowledge

Authors

Zhang, Yifeng; Jiang, Ming; Zhao, Qi

Abstract

The last decades have seen great progress in saliency prediction, with the success of deep neural networks that are able to encode high-level semantics. Yet, while humans have the innate capability in leveraging their knowledge to decide where to look (e.g. people pay more attention to familiar faces such as celebrities), saliency prediction models have only been trained with large eye-tracking datasets. This work proposes to bridge this gap by explicitly incorporating external knowledge for saliency models as humans do. We develop networks that learn to highlight regions by incorporating prior knowledge of semantic relationships, be it general or domain-specific, depending on the task of interest. At the core of the method is a new Graph Semantic Saliency Network (GraSSNet) that constructs a graph that encodes semantic relationships learned from external knowledge. A Spatial Graph Attention Network is then developed to update saliency features based on the learned graph. Experiments show that the proposed model learns to predict saliency from the external knowledge and outperforms the state-of-the-art on four saliency benchmarks.
