A Review of Deep Learning Methods for Irregularly Sampled Medical Time Series Data

Paper Title

A Review of Deep Learning Methods for Irregularly Sampled Medical Time Series Data

Authors

Chenxi Sun, Shenda Hong, Moxian Song, Hongyan Li

Abstract

Irregularly sampled time series (ISTS) data have irregular temporal intervals between observations and different sampling rates between sequences. ISTS data commonly appear in healthcare, economics, and geoscience. In the medical setting in particular, the widely used Electronic Health Records (EHRs) contain abundant irregularly sampled medical time series (ISMTS) data. Developing deep learning methods for EHR data is critical for personalized treatment, precise diagnosis, and medical management. However, applying deep learning models directly to ISMTS data is challenging. On the one hand, ISMTS data have both intra-series and inter-series relations, so both local and global structures should be considered. On the other hand, methods should balance task accuracy against model complexity while retaining generality and interpretability. Many existing works have tried to solve these problems and have achieved good results. In this paper, we review these deep learning methods from the perspectives of technology and task. From the technology-driven perspective, we group them into two categories: missing data-based methods and raw data-based methods. From the task-driven perspective, we also group them into two categories: data imputation-oriented and downstream task-oriented. For each category, we point out advantages and disadvantages. Moreover, we implement some representative methods and compare them on four medical datasets with two tasks. Finally, we discuss the challenges and opportunities in this area.
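
To make the two technology-driven categories concrete, the following is a minimal sketch of one common reading of them, not the paper's own implementation. It assumes that a raw data-based method consumes observations together with the time gap since the previous observation, and that a missing data-based method first discretizes the series onto a regular grid and treats empty bins as missing values. The sample observations and the function names `raw_view` and `missing_view` are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical observations for one clinical variable: (time in hours, value).
observations = [(0.0, 98.0), (0.5, 97.0), (3.2, 95.0), (7.9, 99.0)]


def raw_view(obs):
    """Raw data-based view (assumed form): each value paired with the
    time gap since the previous observation."""
    out = []
    prev_t = None
    for t, v in obs:
        gap = 0.0 if prev_t is None else t - prev_t
        out.append((v, gap))
        prev_t = t
    return out


def missing_view(obs, step, n_bins):
    """Missing data-based view (assumed form): discretize onto a regular
    grid of width `step`; bins without observations are marked missing."""
    values = [math.nan] * n_bins
    mask = [0] * n_bins
    for t, v in obs:
        i = int(t // step)
        if 0 <= i < n_bins:
            values[i] = v  # if several observations share a bin, the last one wins
            mask[i] = 1
    return values, mask


pairs = raw_view(observations)
values, mask = missing_view(observations, step=2.0, n_bins=4)
```

With a 2-hour grid, the first two observations fall into the same bin and the third bin is empty, which illustrates the trade-off of the grid-based view: discretization can both discard observations and introduce missing values that an imputation step must then handle.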
