Paper Title
Scene Graph Generation: A Comprehensive Survey
Authors
Abstract
Deep learning techniques have led to remarkable breakthroughs in generic object detection and have spawned many scene-understanding tasks in recent years. Scene graphs have become a focus of research because of their powerful semantic representation and their applications to scene understanding. Scene Graph Generation (SGG) refers to the task of automatically mapping an image into a semantically structured scene graph, which requires correctly labeling the detected objects and their relationships. Although this is a challenging task, the community has proposed many SGG approaches and achieved good results. In this paper, we provide a comprehensive survey of recent achievements in this field brought about by deep learning techniques. We review 138 representative works that cover different input modalities, and systematically summarize existing methods of image-based SGG from the perspective of feature extraction and fusion. We attempt to connect and systematize the existing visual relationship detection methods in order to summarize and interpret the mechanisms and strategies of SGG in a comprehensive way. Finally, we conclude the survey with an in-depth discussion of open problems and future research directions. This survey is intended to help readers develop a better understanding of the current state of research and its main ideas.