Multi-Scored Sleep Databases: How to Exploit the Multiple-Labels in Automated Sleep Scoring

Paper Title

Multi-Scored Sleep Databases: How to Exploit the Multiple-Labels in Automated Sleep Scoring

Authors

Luigi Fiorillo, Davide Pedroncelli, Valentina Agostini, Paolo Favaro, Francesca Dalia Faraci

Abstract

Study Objectives: Inter-scorer variability in scoring polysomnograms is a well-known problem. Most of the existing automated sleep scoring systems are trained using labels annotated by a single scorer, whose subjective evaluation is transferred to the model. When annotations from two or more scorers are available, the scoring models are usually trained on the scorer consensus. The averaged scorer's subjectivity is transferred into the model, losing information about the internal variability among different scorers. In this study, we aim to insert the multiple-knowledge of the different physicians into the training procedure. The goal is to optimize a model training, exploiting the full information that can be extracted from the consensus of a group of scorers.

Methods: We train two lightweight deep learning based models on three different multi-scored databases. We exploit the label smoothing technique together with a soft-consensus (LSSC) distribution to insert the multiple-knowledge in the training procedure of the model. We introduce the averaged cosine similarity metric (ACS) to quantify the similarity between the hypnodensity-graph generated by the models with-LSSC and the hypnodensity-graph generated by the scorer consensus.

Results: The performance of the models improves on all the databases when we train the models with our LSSC. We found an increase in ACS (up to 6.4%) between the hypnodensity-graph generated by the models trained with-LSSC and the hypnodensity-graph generated by the consensus.

Conclusion: Our approach definitely enables a model to better adapt to the consensus of the group of scorers. Future work will focus on further investigations on different scoring architectures and hopefully large-scale-heterogeneous multi-scored datasets.
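
The abstract does not give the formulas for LSSC or ACS, so the following is only a minimal NumPy sketch of how the two ideas could look in practice. It rests on several assumptions that are not taken from the source: five sleep stages, a soft-consensus defined as the plain fraction of scorer votes per stage, a smoothed target formed as a convex mix of the majority-vote one-hot label and that soft-consensus with a hypothetical weight `alpha`, and ACS computed as the mean per-epoch cosine similarity between two hypnodensity arrays. All function names are illustrative.

```python
import numpy as np

N_STAGES = 5  # assumption: W, N1, N2, N3, REM


def soft_consensus(votes, n_stages=N_STAGES):
    """Fraction of scorers voting for each stage.

    votes: integer array of shape (n_epochs, n_scorers).
    Returns an array of shape (n_epochs, n_stages) whose rows sum to 1.
    """
    votes = np.asarray(votes)
    one_hot = np.eye(n_stages)[votes]  # (n_epochs, n_scorers, n_stages)
    return one_hot.mean(axis=1)


def lssc_targets(votes, alpha=0.1, n_stages=N_STAGES):
    """Assumed smoothing rule: (1 - alpha) * majority one-hot + alpha * soft-consensus."""
    soft = soft_consensus(votes, n_stages)
    # Majority vote per epoch; ties resolve to the lowest stage index.
    hard = np.eye(n_stages)[soft.argmax(axis=1)]
    return (1.0 - alpha) * hard + alpha * soft


def averaged_cosine_similarity(pred, ref):
    """Mean per-epoch cosine similarity between two hypnodensity arrays.

    pred, ref: arrays of shape (n_epochs, n_stages) with non-zero rows.
    """
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    num = (pred * ref).sum(axis=1)
    den = np.linalg.norm(pred, axis=1) * np.linalg.norm(ref, axis=1)
    return float(np.mean(num / den))


# Toy example: 2 epochs scored by 3 scorers (stage indices are arbitrary here).
toy_votes = [[0, 0, 1], [2, 2, 2]]
toy_targets = lssc_targets(toy_votes, alpha=0.3)
toy_acs = averaged_cosine_similarity(toy_targets, soft_consensus(toy_votes))
```

In this sketch, an epoch on which all scorers agree keeps a one-hot target, while an epoch with disagreement moves part of the probability mass toward the minority votes. The targets could then be used with a cross-entropy loss in place of one-hot labels.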
