TTNet: Real-time temporal and spatial video analysis of table tennis

Paper title

TTNet: Real-time temporal and spatial video analysis of table tennis

Authors

Voeikov, Roman; Falaleev, Nikolay; Baikulov, Ruslan

Abstract

We present TTNet, a neural network aimed at real-time processing of high-resolution table tennis videos, providing both temporal (event spotting) and spatial (ball detection and semantic segmentation) data. This approach gives core information for reasoning about score updates by an auto-referee system. We also publish a multi-task dataset, OpenTTGames, with videos of table tennis games at 120 fps labeled with events, semantic segmentation masks, and ball coordinates for the evaluation of multi-task approaches, primarily oriented toward spotting quick events and tracking small objects. TTNet demonstrated 97.0% accuracy in game event spotting along with 2 pixels RMSE in ball detection with 97.5% accuracy on the test part of the presented dataset. The proposed network allows the processing of downscaled full HD videos with inference time below 6 ms per input tensor on a machine with a single consumer-grade GPU. Thus, we are contributing to the development of real-time multi-task deep learning applications and presenting an approach that is potentially capable of substituting manual data collection by sports scouts, providing support for referees' decision-making, and gathering extra information about the game process.
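
To make the reported figures concrete, the following is a minimal sketch of how a ball-position RMSE in pixels and a per-frame time budget at 120 fps could be computed. The abstract does not define the metric precisely, so treating RMSE as the root mean square of the Euclidean pixel distance between predicted and labeled ball centers is an assumption, and the helper names `ball_rmse` and `frame_budget_ms` are hypothetical. The abstract also does not say how many frames one input tensor covers, so the budget comparison is only illustrative.

```python
import math


def ball_rmse(predicted, ground_truth):
    """Root mean square of the Euclidean pixel distance between
    predicted and labeled ball centers (assumed metric definition)."""
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [
        (px - gx) ** 2 + (py - gy) ** 2
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


# Hypothetical coordinates, not taken from OpenTTGames.
predicted = [(100.0, 50.0), (103.0, 54.0)]
labeled = [(100.0, 50.0), (100.0, 50.0)]
print(f"RMSE: {ball_rmse(predicted, labeled):.2f} px")
print(f"Budget at 120 fps: {frame_budget_ms(120):.2f} ms per frame")
```

At 120 fps each frame leaves roughly 8.33 ms, so an inference time below 6 ms fits within that budget if one tensor is processed per frame.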
