Paper title
Comparison of Anomaly Detectors: Context Matters
Paper authors
Paper abstract
Deep generative models are now challenging the classical methods in the field of anomaly detection. Every new method provides evidence of outperforming its predecessors, often with contradictory results. The objective of this comparison is twofold: to compare anomaly detection methods of various paradigms, with a focus on deep generative models, and to identify sources of variability that can yield different results. The methods were compared on popular tabular and image datasets. We identified the main sources of variability to be the experimental conditions: i) the type of dataset (tabular or image) and the nature of the anomalies (statistical or semantic), and ii) the strategy for selecting hyperparameters, especially the number of anomalies available in the validation set. Different methods perform best in different contexts, i.e., combinations of experimental conditions together with computational time. This explains the variability of previous results and highlights the importance of carefully specifying the context when publishing a new method. All our code and results are available for download.