Paper Title

Audio-Based Deep Learning Frameworks for Detecting COVID-19

Authors

Dat Ngo, Lam Pham, Truong Hoang, Sefki Kolozali, Delaram Jarchi

Abstract

This paper evaluates a wide range of audio-based deep learning frameworks applied to breathing, cough, and speech sounds for detecting COVID-19. In general, the audio recording inputs are transformed into low-level spectrogram features, which are then fed into pre-trained deep learning models to extract high-level embedding features. Next, the dimensionality of these high-level embedding features is reduced before a Light Gradient Boosting Machine (LightGBM) is fine-tuned as the back-end classifier. Our experiments on the Second DiCOVA Challenge achieved the highest Area Under the Curve (AUC), F1 score, sensitivity score, and specificity score of 89.03%, 64.41%, 63.33%, and 95.13%, respectively. Based on these scores, our method outperforms the state-of-the-art systems and improves the challenge baseline by 4.33%, 6.00%, and 8.33% in terms of AUC, F1 score, and sensitivity score, respectively.
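
The sketch below illustrates the pipeline described in the abstract: audio is converted to a low-level spectrogram, a pre-trained deep model produces a high-level embedding, the embedding dimensionality is reduced, and LightGBM serves as the back-end classifier. This is not the authors' code; the choice of MobileNetV2 as the embedding extractor, the log-mel settings, the PCA size, and the LightGBM hyperparameters are illustrative assumptions, since the abstract does not specify the exact configuration.

```python
# Minimal sketch of the spectrogram -> pre-trained embedding -> dimensionality
# reduction -> LightGBM pipeline. Model and parameter choices are assumptions,
# not the paper's configuration.
import numpy as np
import librosa
import torch
import torchvision
from sklearn.decomposition import PCA
from lightgbm import LGBMClassifier

def to_log_mel(path, sr=16000, n_mels=128):
    """Audio recording -> low-level log-mel spectrogram (assumed settings)."""
    y, _ = librosa.load(path, sr=sr, mono=True)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)

# Generic pre-trained CNN used here only as a stand-in embedding extractor;
# the paper evaluates several pre-trained deep learning models.
backbone = torchvision.models.mobilenet_v2(weights="DEFAULT")
backbone.classifier = torch.nn.Identity()  # keep the pooled 1280-dim embedding
backbone.eval()

def extract_embedding(log_mel):
    """Log-mel spectrogram -> high-level embedding feature vector."""
    x = torch.tensor(log_mel, dtype=torch.float32)
    x = x.unsqueeze(0).repeat(3, 1, 1).unsqueeze(0)  # 1 x 3 x n_mels x frames
    with torch.no_grad():
        return backbone(x).squeeze(0).numpy()

def build_classifier(embeddings, labels, n_components=64):
    """Reduce embedding dimensionality, then fit LightGBM as the back end."""
    pca = PCA(n_components=n_components).fit(embeddings)
    clf = LGBMClassifier(n_estimators=500, learning_rate=0.05)
    clf.fit(pca.transform(embeddings), labels)
    return pca, clf

# Usage with hypothetical file lists and labels:
# X = np.stack([extract_embedding(to_log_mel(p)) for p in train_paths])
# pca, clf = build_classifier(X, train_labels)
# X_test = np.stack([extract_embedding(to_log_mel(p)) for p in test_paths])
# covid_probs = clf.predict_proba(pca.transform(X_test))[:, 1]
```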
