Title
Pose Forecasting in Industrial Human-Robot Collaboration
Authors
Abstract
Pushing back the frontiers of collaborative robots in industrial environments, we propose a new Separable-Sparse Graph Convolutional Network (SeS-GCN) for pose forecasting. For the first time, SeS-GCN bottlenecks the interaction of the spatial, temporal and channel-wise dimensions in GCNs, and it learns sparse adjacency matrices by a teacher-student framework. Compared to the state-of-the-art, it only uses 1.72% of the parameters and it is ~4 times faster, while still performing comparably in forecasting accuracy on Human3.6M at 1 second in the future, which enables cobots to be aware of human operators. As a second contribution, we present a new benchmark of Cobots and Humans in Industrial COllaboration (CHICO). CHICO includes multi-view videos, 3D poses and trajectories of 20 human operators and cobots, engaging in 7 realistic industrial actions. Additionally, it reports 226 genuine collisions, taking place during the human-cobot interaction. We test SeS-GCN on CHICO for two important perception tasks in robotics: human pose forecasting, where it reaches an average error of 85.3 mm (MPJPE) at 1 sec in the future with a run time of 2.3 msec, and collision detection, by comparing the forecasted human motion with the known cobot motion, obtaining an F1-score of 0.64.
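The two evaluation tasks above can be made concrete with a short sketch. MPJPE is the standard mean per-joint position error over 3D keypoints, and a simple collision check compares the forecasted human joints against the known cobot trajectory frame by frame. This is a minimal illustration, not the paper's implementation; the function names, array shapes, and the 50 mm proximity threshold are assumptions for the example.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean Per-Joint Position Error: average Euclidean distance (in the
    input units, e.g. mm) between predicted and ground-truth 3D joints.
    pred, gt: arrays of shape (frames, joints, 3)."""
    return np.linalg.norm(pred - gt, axis=-1).mean()

def detect_collision(human_forecast, cobot_traj, threshold_mm=50.0):
    """Flag frames where any forecasted human joint comes within
    `threshold_mm` of any cobot keypoint. The threshold and the
    min-distance criterion are illustrative assumptions, not the
    paper's exact collision rule.
    human_forecast: (frames, joints, 3); cobot_traj: (frames, points, 3).
    Returns a boolean array of shape (frames,)."""
    # Pairwise joint-to-keypoint distances per frame: (frames, joints, points)
    d = np.linalg.norm(
        human_forecast[:, :, None, :] - cobot_traj[:, None, :, :], axis=-1
    )
    return d.min(axis=(1, 2)) < threshold_mm
```

With forecasts evaluated this way, the abstract's 85.3 mm figure corresponds to `mpjpe` averaged over the 1-second horizon, and the F1-score is computed from the per-frame collision flags against the 226 recorded collisions.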