Paper Title

Towards Holistic Surgical Scene Understanding

Paper Authors

Natalia Valderrama, Paola Ruiz Puentes, Isabela Hernández, Nicolás Ayobi, Mathilde Verlyk, Jessica Santander, Juan Caicedo, Nicolás Fernández, Pablo Arbeláez

Paper Abstract

Most benchmarks for studying surgical interventions focus on a specific challenge instead of leveraging the intrinsic complementarity among different tasks. In this work, we present a new experimental framework towards holistic surgical scene understanding. First, we introduce the Phase, Step, Instrument, and Atomic Visual Action recognition (PSI-AVA) Dataset. PSI-AVA includes annotations for both long-term (Phase and Step recognition) and short-term reasoning (Instrument detection and novel Atomic Action recognition) in robot-assisted radical prostatectomy videos. Second, we present Transformers for Action, Phase, Instrument, and steps Recognition (TAPIR) as a strong baseline for surgical scene understanding. TAPIR leverages our dataset's multi-level annotations as it benefits from the learned representation on the instrument detection task to improve its classification capacity. Our experimental results in both PSI-AVA and other publicly available databases demonstrate the adequacy of our framework to spur future research on holistic surgical scene understanding.
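The abstract describes TAPIR as a multi-level model that reuses representations learned on instrument detection to drive the phase, step, instrument, and atomic-action classification tasks. As a rough illustration of that idea only, the sketch below shows a generic multi-task head over shared clip-level and per-box features; all class counts, feature dimensions, and module names are placeholder assumptions for this sketch, not the authors' implementation.

```python
import torch
import torch.nn as nn

# Placeholder task sizes and feature dimension -- assumptions for illustration,
# not values taken from the PSI-AVA dataset or the TAPIR model.
NUM_PHASES, NUM_STEPS, NUM_INSTRUMENTS, NUM_ACTIONS = 11, 21, 7, 16
FEAT_DIM = 768


class MultiTaskHead(nn.Module):
    """Hypothetical multi-task head over shared detection features."""

    def __init__(self, feat_dim: int = FEAT_DIM):
        super().__init__()
        # Long-term tasks (phase, step) are predicted from a clip-level feature.
        self.phase_head = nn.Linear(feat_dim, NUM_PHASES)
        self.step_head = nn.Linear(feat_dim, NUM_STEPS)
        # Short-term tasks (instrument, atomic action) are predicted from
        # per-box features produced by an instrument detector.
        self.instrument_head = nn.Linear(feat_dim, NUM_INSTRUMENTS)
        self.action_head = nn.Linear(feat_dim, NUM_ACTIONS)

    def forward(self, video_feat: torch.Tensor, box_feats: torch.Tensor):
        # video_feat: (B, D) clip-level feature
        # box_feats:  (B, N, D) features of N detected instrument boxes
        return {
            "phase": self.phase_head(video_feat),
            "step": self.step_head(video_feat),
            "instrument": self.instrument_head(box_feats),
            "action": self.action_head(box_feats),
        }


if __name__ == "__main__":
    head = MultiTaskHead()
    video_feat = torch.randn(2, FEAT_DIM)    # batch of 2 clips
    box_feats = torch.randn(2, 5, FEAT_DIM)  # 5 detected boxes per clip
    for task, logits in head(video_feat, box_feats).items():
        print(task, tuple(logits.shape))
```

The point of the sketch is the structural split the abstract implies: video-level features feed the long-term recognition tasks, while the detector's box-level features feed the short-term ones, so improvements in the shared detection representation can benefit all classification heads.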
