Robust and Explainable Autoencoders for Unsupervised Time Series Outlier Detection---Extended Version

Paper Title

Robust and Explainable Autoencoders for Unsupervised Time Series Outlier Detection---Extended Version

Authors

Tung Kieu, Bin Yang, Chenjuan Guo, Christian S. Jensen, Yan Zhao, Feiteng Huang, Kai Zheng

Abstract

Time series data occurs widely, and outlier detection is a fundamental problem in data mining, which has numerous applications. Existing autoencoder-based approaches deliver state-of-the-art performance on challenging real-world data but are vulnerable to outliers and exhibit low explainability. To address these two limitations, we propose robust and explainable unsupervised autoencoder frameworks that decompose an input time series into a clean time series and an outlier time series using autoencoders. Improved explainability is achieved because clean time series are better explained with easy-to-understand patterns such as trends and periodicities. We provide insight into this by means of a post-hoc explainability analysis and empirical studies. In addition, since outliers are separated from clean time series iteratively, our approach offers improved robustness to outliers, which in turn improves accuracy. We evaluate our approach on five real-world datasets and report improvements over the state-of-the-art approaches in terms of robustness and explainability. This is an extended version of "Robust and Explainable Autoencoders for Unsupervised Time Series Outlier Detection", to appear in IEEE ICDE 2022.
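The decomposition idea in the abstract (a clean series plus an outlier series, separated iteratively) can be illustrated with a small sketch. This is not the authors' framework: it assumes a rank-1 truncated SVD as a stand-in for a trained autoencoder, soft-thresholding as the rule that moves large residuals into the outlier part, and a toy sine series with one injected spike. The function names and the parameters `rank`, `lam`, and `n_iter` are hypothetical and chosen only for illustration.

```python
import numpy as np

def low_rank_reconstruct(X, rank):
    """Stand-in for an autoencoder (assumption): encode rows onto the top
    `rank` right singular vectors and decode back, i.e. a linear autoencoder."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

def soft_threshold(R, lam):
    """Shrink residuals toward zero; entries with |r| <= lam become 0."""
    return np.sign(R) * np.maximum(np.abs(R) - lam, 0.0)

def decompose(X, rank=1, lam=0.5, n_iter=10):
    """Alternate between fitting the clean part and isolating outliers."""
    outliers = np.zeros_like(X)
    for _ in range(n_iter):
        # Fit the "clean" model on data with the current outliers removed.
        clean = low_rank_reconstruct(X - outliers, rank)
        # Whatever the clean model cannot explain, beyond lam, is an outlier.
        outliers = soft_threshold(X - clean, lam)
    return clean, outliers

# Toy data: 40 periods of a noisy sine wave with one injected spike.
rng = np.random.default_rng(0)
t = np.arange(2000)
series = np.sin(2 * np.pi * t / 50) + 0.05 * rng.standard_normal(t.size)
series[777] += 5.0
X = series.reshape(40, 50)  # one row per period

clean, outliers = decompose(X)
print("largest outlier at index", int(np.abs(outliers).argmax()))
```

In this sketch the clean part ends up close to a single repeated periodic pattern, which is the kind of easy-to-understand structure the abstract associates with explainability, while the outlier part is zero wherever the residual stays below `lam`.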
