Paper Title

Detecting Change Intervals with Isolation Distributional Kernel

Authors

Cao, Yang; Zhu, Ye; Ting, Kai Ming; Salim, Flora D.; Li, Hong Xian; Yang, Luxing; Li, Gang

Abstract

Detecting abrupt changes in data distribution is one of the most significant tasks in streaming data analysis. Although many unsupervised Change-Point Detection (CPD) methods have been proposed recently to identify those changes, they still suffer from missing subtle changes, poor scalability, and/or sensitivity to outliers. To meet these challenges, we are the first to generalise the CPD problem as a special case of the Change-Interval Detection (CID) problem. We then propose a CID method, named iCID, based on the recent Isolation Distributional Kernel (IDK). iCID identifies a change interval when there is a high dissimilarity score between two non-homogeneous, temporally adjacent intervals. The data-dependent property and finite feature map of IDK enable iCID to efficiently identify various types of change-points in data streams while tolerating outliers. Moreover, the proposed online and offline versions of iCID can optimise key parameter settings. The effectiveness and efficiency of iCID have been systematically verified on both synthetic and real-world datasets.
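
As a rough illustration of the scoring idea in the abstract, the Python sketch below embeds each interval with a finite feature map and scores adjacent intervals by their dissimilarity. It is not the authors' implementation. It assumes a nearest-sampled-point (Voronoi cell) form of the isolation kernel, NumPy, and a cosine-style normalisation. The function names and the parameters `psi`, `t`, and `window` are hypothetical, and the sketch omits iCID's thresholding and parameter optimisation.

```python
import numpy as np


def fit_isolation_partitions(data, psi=8, t=100, seed=0):
    """Sample t subsets of psi points; each subset defines one Voronoi partitioning.

    Assumption: a nearest-neighbour isolation kernel; names are illustrative.
    """
    rng = np.random.default_rng(seed)
    idx = np.stack([rng.choice(len(data), size=psi, replace=False) for _ in range(t)])
    return data[idx]  # shape (t, psi, d)


def mean_embedding(points, centers):
    """Average the one-hot cell memberships of all points in an interval."""
    t, psi, _ = centers.shape
    # Squared distance from every point to every sampled centre: shape (t, n, psi).
    d2 = ((points[None, :, None, :] - centers[:, None, :, :]) ** 2).sum(-1)
    nearest = d2.argmin(axis=2)  # cell index per partitioning and point, shape (t, n)
    emb = np.zeros((t, psi))
    for i in range(t):
        emb[i] = np.bincount(nearest[i], minlength=psi) / len(points)
    return emb.ravel()  # finite feature map of length t * psi


def dissimilarity(a, b, centers):
    """One minus the normalised dot product of two interval embeddings."""
    ea, eb = mean_embedding(a, centers), mean_embedding(b, centers)
    return 1.0 - float(ea @ eb) / float(np.linalg.norm(ea) * np.linalg.norm(eb))


def interval_scores(stream, window, centers):
    """Score each pair of adjacent, non-overlapping intervals of length `window`."""
    n_int = len(stream) // window
    chunks = [stream[i * window:(i + 1) * window] for i in range(n_int)]
    return np.array([dissimilarity(chunks[i], chunks[i + 1], centers)
                     for i in range(n_int - 1)])


# Synthetic stream: the distribution shifts from N(0, 1) to N(5, 1) at index 200.
rng = np.random.default_rng(42)
stream = np.concatenate([rng.normal(0, 1, (200, 1)), rng.normal(5, 1, (200, 1))])
centers = fit_isolation_partitions(stream, psi=8, t=100, seed=0)
scores = interval_scores(stream, window=50, centers=centers)
print(int(np.argmax(scores)))  # index of the adjacent pair with the highest score
```

In this toy setting the highest score should fall on the pair of intervals that straddles the shift. A real detector would still need a rule for deciding when a score counts as "high", which the sketch leaves open.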
