Uncovering the structure of clinical EEG signals with self-supervised learning

Title

Uncovering the structure of clinical EEG signals with self-supervised learning

Authors

Hubert Banville, Omar Chehab, Aapo Hyvärinen, Denis-Alexander Engemann, Alexandre Gramfort

Abstract

Objective. Supervised learning paradigms are often limited by the amount of labeled data that is available. This phenomenon is particularly problematic in clinically-relevant data, such as electroencephalography (EEG), where labeling can be costly in terms of specialized expertise and human processing time. Consequently, deep learning architectures designed to learn on EEG data have yielded relatively shallow models and performances at best similar to those of traditional feature-based approaches. However, in most situations, unlabeled data is available in abundance. By extracting information from this unlabeled data, it might be possible to reach competitive performance with deep neural networks despite limited access to labels.

Approach. We investigated self-supervised learning (SSL), a promising technique for discovering structure in unlabeled data, to learn representations of EEG signals. Specifically, we explored two tasks based on temporal context prediction as well as contrastive predictive coding on two clinically-relevant problems: EEG-based sleep staging and pathology detection. We conducted experiments on two large public datasets with thousands of recordings and performed baseline comparisons with purely supervised and hand-engineered approaches.

Main results. Linear classifiers trained on SSL-learned features consistently outperformed purely supervised deep neural networks in low-labeled data regimes while reaching competitive performance when all labels were available. Additionally, the embeddings learned with each method revealed clear latent structures related to physiological and clinical phenomena, such as age effects.

Significance. We demonstrate the benefit of self-supervised learning approaches on EEG data. Our results suggest that SSL may pave the way to a wider use of deep learning models on EEG data.
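To make the idea of a temporal context prediction task concrete, the sketch below shows one possible way to generate pseudo-labels from an unlabeled recording: pairs of windows that are close in time are labeled 1, and pairs that are far apart are labeled 0. The abstract does not describe the sampling procedure, so everything here is an illustrative assumption: the function name `sample_rp_pairs` and the thresholds `tau_pos` and `tau_neg` are hypothetical and should not be read as the authors' implementation.

```python
import numpy as np


def sample_rp_pairs(n_windows, n_pairs, tau_pos, tau_neg, rng):
    """Sample (anchor, other, label) window indices for a pretext task.

    Hypothetical sketch: label 1 means the two windows are at most
    `tau_pos` windows apart, label 0 means at least `tau_neg` apart.
    """
    if not (1 <= tau_pos < tau_neg):
        raise ValueError("need 1 <= tau_pos < tau_neg")
    if n_windows < 2 * tau_neg + 1:
        # Guarantees every anchor has at least one 'far' candidate.
        raise ValueError("recording too short for the chosen tau_neg")

    all_idx = np.arange(n_windows)
    anchors = rng.integers(0, n_windows, size=n_pairs)
    labels = rng.integers(0, 2, size=n_pairs)  # 1 = close, 0 = far
    others = np.empty(n_pairs, dtype=int)

    for k, (a, y) in enumerate(zip(anchors, labels)):
        dist = np.abs(all_idx - a)
        if y == 1:
            candidates = all_idx[(dist > 0) & (dist <= tau_pos)]
        else:
            candidates = all_idx[dist >= tau_neg]
        others[k] = rng.choice(candidates)

    return anchors, others, labels


rng = np.random.default_rng(0)
anchors, others, labels = sample_rp_pairs(
    n_windows=100, n_pairs=200, tau_pos=2, tau_neg=10, rng=rng
)
```

A feature extractor trained to predict `labels` from the two windows needs no human annotation; per the abstract, a linear classifier is then trained on the learned features using whatever labels are available.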
