Paper Title
Event-based Robotic Grasping Detection with Neuromorphic Vision Sensor and Event-Stream Dataset
Paper Authors
Paper Abstract
Robotic grasping plays an important role in the field of robotics. Current state-of-the-art robotic grasping detection systems are usually built on conventional vision sensors, such as RGB-D cameras. Compared to traditional frame-based computer vision, neuromorphic vision is a small and young research community. Event-based datasets are currently limited because annotating asynchronous event streams is troublesome: annotating a large-scale vision dataset often takes considerable resources, especially for video-level annotation. In this work, we consider the problem of detecting robotic grasps in a moving camera view of a scene containing objects. To obtain more agile robotic perception, a neuromorphic vision sensor (DAVIS) attached to the robot gripper is introduced to explore its potential use in grasping detection. We construct a robotic grasping dataset, named the Event-Stream Dataset, with 91 objects. A spatio-temporal mixed particle filter (SMP Filter) is proposed to track LED-based grasp rectangles, which enables video-level annotation of a single grasp rectangle per object. As the LEDs blink at high frequency, the Event-Stream Dataset is annotated at a high frequency of 1 kHz. Based on the Event-Stream Dataset, we develop a deep neural network for grasping detection that treats the angle learning problem as classification instead of regression. The method achieves high detection accuracy on our Event-Stream Dataset, with 93% precision at the object-wise level. This work provides a large-scale, well-annotated dataset and promotes neuromorphic vision applications in agile robotics.
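To illustrate the angle-as-classification idea described in the abstract, the minimal sketch below discretizes the grasp angle into bins and trains with a cross-entropy loss instead of regressing a continuous scalar. The bin count (18), the two-channel ON/OFF event-frame encoding, and the toy CNN are all illustrative assumptions, not the paper's actual network.

# Minimal sketch: grasp-angle learning as classification (illustrative,
# not the paper's architecture). The continuous rotation angle is mapped
# to one of NUM_ANGLE_BINS classes and trained with cross-entropy.
import torch
import torch.nn as nn

NUM_ANGLE_BINS = 18  # assumption: 180 degrees split into 10-degree classes

def angle_to_bin(angle_deg: torch.Tensor) -> torch.Tensor:
    """Map an angle in degrees to a class index in [0, NUM_ANGLE_BINS)."""
    return (angle_deg % 180.0 / (180.0 / NUM_ANGLE_BINS)).long()

class AngleClassifier(nn.Module):
    """Toy CNN over an event frame (events accumulated into a 2D image,
    with ON/OFF polarities as two input channels)."""
    def __init__(self, in_channels: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, NUM_ANGLE_BINS)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x).flatten(1))

model = AngleClassifier()
events = torch.randn(4, 2, 128, 128)                  # batch of event frames
target = angle_to_bin(torch.tensor([0., 45., 90., 170.]))
loss = nn.CrossEntropyLoss()(model(events), target)
loss.backward()

One motivation for this formulation, consistent with the abstract's claim, is that classification over angle bins sidesteps the wrap-around discontinuity of regressing an orientation directly (0 and 180 degrees describe the same grasp rectangle).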