Paper Title
Continuous Perception for Classifying Shapes and Weights of Garments for Robotic Vision Applications
Paper Authors
Paper Abstract
We present an approach to continuous perception for robotic laundry tasks. Our assumption is that the visual prediction of a garment's shape and weight is possible via a neural network that learns the dynamic changes of garments from video sequences. Continuous perception is leveraged during training by inputting consecutive frames, from which the network learns how a garment deforms. To evaluate our hypothesis, we captured a dataset of 40K RGB and 40K depth video sequences while garments were being manipulated. We also conducted ablation studies to understand whether the neural network learns the physical and dynamic properties of garments. Our findings suggest that a modified AlexNet-LSTM architecture has the best classification performance for garment shapes and weights. To further provide evidence that continuous perception facilitates the prediction of garment shapes and weights, we evaluated our network on unseen video sequences and computed a 'moving average' over a sequence of predictions. We found that our network achieves classification accuracies of 48% and 60% for the shapes and weights of garments, respectively.
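The abstract gives no implementation details, so the following is only a minimal PyTorch sketch of the two ideas it names: an AlexNet-LSTM classifier that aggregates consecutive frames, and a moving average over a sequence of predictions. The class and function names, the hidden size, and the window length here are assumptions for illustration, not the authors' code.

```python
import torch
import torch.nn as nn
import torchvision.models as models


class AlexNetLSTM(nn.Module):
    """Per-frame AlexNet features feed an LSTM; the final hidden
    state classifies the whole clip (e.g. a shape or weight category)."""

    def __init__(self, num_classes, hidden_size=256):
        super().__init__()
        backbone = models.alexnet(weights=None)   # the paper's exact modifications are unknown
        self.features = backbone.features         # convolutional feature extractor
        self.pool = nn.AdaptiveAvgPool2d((6, 6))  # fixed-size feature map for any input resolution
        self.lstm = nn.LSTM(input_size=256 * 6 * 6,
                            hidden_size=hidden_size,
                            batch_first=True)
        self.head = nn.Linear(hidden_size, num_classes)

    def forward(self, clip):                       # clip: (B, T, 3, H, W)
        b, t = clip.shape[:2]
        x = self.features(clip.flatten(0, 1))      # (B*T, 256, h, w)
        x = self.pool(x).flatten(1)                # (B*T, 9216)
        _, (h_n, _) = self.lstm(x.view(b, t, -1))  # temporal aggregation over frames
        return self.head(h_n[-1])                  # (B, num_classes)


def moving_average_prediction(model, clip, window=5):
    """Slide a window over the clip, average the per-window class
    probabilities, and return the smoothed class prediction."""
    model.eval()
    with torch.no_grad():
        t = clip.shape[1]
        probs = [torch.softmax(model(clip[:, s:s + window]), dim=-1)
                 for s in range(t - window + 1)]
    return torch.stack(probs).mean(dim=0).argmax(dim=-1)
```

As a usage example under the same assumptions, a clip of 16 RGB frames at 224x224, `clip = torch.randn(1, 16, 3, 224, 224)`, could be classified with `moving_average_prediction(AlexNetLSTM(num_classes=3), clip)`; the number of classes is hypothetical.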