Paper Title
Self-Supervised Human Activity Recognition with Localized Time-Frequency Contrastive Representation Learning
Paper Authors
Paper Abstract
In this paper, we propose a self-supervised learning solution for human activity recognition with smartphone accelerometer data. We aim to develop a model that learns strong representations from accelerometer signals, in order to perform robust human activity classification, while reducing the model's reliance on class labels. Specifically, we intend to enable cross-dataset transfer learning, such that our network pre-trained on a particular dataset can perform effective activity classification on other datasets (after a small amount of fine-tuning). To tackle this problem, we design our solution with the intention of learning as much information from the accelerometer signals as possible. As a result, we design two separate pipelines, one that learns the data in the time-frequency domain and the other in the time domain alone. In order to address the issues mentioned above with regard to cross-dataset transfer learning, we use self-supervised contrastive learning to train each of these streams. Next, each stream is fine-tuned for final classification, and eventually the two are fused to provide the final results. We evaluate the performance of the proposed solution on three datasets, namely MotionSense, HAPT, and HHAR, and demonstrate that our solution outperforms prior works in this field. We further evaluate the performance of the method in learning generalized features by using the MobiAct dataset for pre-training and the remaining three datasets for the downstream classification task, and show that the proposed solution achieves better performance than other self-supervised methods in cross-dataset transfer learning.
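To make the pre-training stage of the described pipeline more concrete, below is a minimal, illustrative sketch of SimCLR-style contrastive pre-training on raw accelerometer windows, which could serve as the time-domain stream. The encoder architecture (Encoder1D), the augmentations (jitter and channel scaling), and all hyperparameters are assumptions for illustration only and are not taken from the paper; the time-frequency stream and the fusion step are not shown.

```python
# Illustrative sketch (not the authors' implementation) of contrastive
# pre-training on accelerometer windows of shape (batch, 3 axes, T samples).
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder1D(nn.Module):
    """Hypothetical 1D-CNN encoder with a projection head."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(3, 32, kernel_size=9, padding=4), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=9, padding=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.proj = nn.Sequential(nn.Linear(64, feat_dim), nn.ReLU(),
                                  nn.Linear(feat_dim, feat_dim))

    def forward(self, x):
        h = self.conv(x).squeeze(-1)              # (batch, 64)
        return F.normalize(self.proj(h), dim=1)   # unit-norm embeddings

def augment(x):
    """Toy signal augmentations: additive jitter plus random channel scaling."""
    noise = 0.05 * torch.randn_like(x)
    scale = 1.0 + 0.1 * torch.randn(x.size(0), x.size(1), 1, device=x.device)
    return (x + noise) * scale

def nt_xent(z1, z2, temperature=0.1):
    """NT-Xent loss: each sample's positive is its other augmented view."""
    z = torch.cat([z1, z2], dim=0)                 # (2B, D)
    sim = z @ z.t() / temperature                  # cosine similarities
    mask = torch.eye(z.size(0), dtype=torch.bool, device=z.device)
    sim.masked_fill_(mask, float('-inf'))          # exclude self-similarity
    B = z1.size(0)
    targets = torch.cat([torch.arange(B, 2 * B),
                         torch.arange(0, B)]).to(z.device)
    return F.cross_entropy(sim, targets)

# One pre-training step on a stand-in batch of raw windows.
encoder = Encoder1D()
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3)
x = torch.randn(16, 3, 128)                        # placeholder accelerometer batch
loss = nt_xent(encoder(augment(x)), encoder(augment(x)))
loss.backward()
optimizer.step()
```

In such a setup, the pre-trained encoder would be kept, the projection head discarded, and a small classifier fine-tuned on labeled windows of the downstream dataset, matching the fine-tune-then-fuse procedure the abstract outlines.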