Collaborative Anomaly Detection

Paper Title

Collaborative Anomaly Detection

Authors

Bai, Ke; Zhang, Aonan; Li, Zhizhong; Henao, Ricardo; Wang, Chong; Carin, Lawrence

Abstract

In recommendation systems, items are likely to be exposed to various users and we would like to learn about the familiarity of a new user with an existing item. This can be formulated as an anomaly detection (AD) problem distinguishing between "common users" (nominal) and "fresh users" (anomalous). Considering the sheer volume of items and the sparsity of user-item paired data, independently applying conventional single-task detection methods on each item quickly becomes difficult, while correlations between items are ignored. To address this multi-task anomaly detection problem, we propose collaborative anomaly detection (CAD) to jointly learn all tasks with an embedding encoding correlations among tasks. We explore CAD with conditional density estimation and conditional likelihood ratio estimation. We found that: (i) estimating a likelihood ratio enjoys more efficient learning and yields better results than density estimation; (ii) it is beneficial to select a small number of tasks in advance to learn a task embedding model, and then use it to warm-start all task embeddings. Consequently, these embeddings can capture correlations between tasks and generalize to new correlated tasks.
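
To make the idea of a task-conditioned likelihood ratio concrete, here is a minimal toy sketch. It is not the paper's model. It assumes the standard classifier-based density-ratio trick: a logistic classifier is trained to separate nominal samples from samples of a known reference distribution, and its logit is read as a log likelihood ratio. Everything in it is hypothetical and chosen for illustration: one-dimensional user features, a scalar embedding `e_t` per item (task), the shared score form `f(x, t) = b - s * (x - e_t)^2`, the uniform reference distribution, and all function names.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def log_ratio(x, task, params):
    # Assumed score form: shared scale s and bias b, one scalar embedding per task.
    # A higher value means the user looks more "nominal" for this item.
    e = params["emb"][task]
    return params["b"] - params["s"] * (x - e) ** 2


def make_data(task_means, n_per_task, low, high, rng):
    # Toy data: nominal users of task t follow N(mean_t, 1) and get label 1;
    # reference samples are uniform on [low, high] and get label 0.
    data = []
    for task, mu in enumerate(task_means):
        for _ in range(n_per_task):
            data.append((rng.gauss(mu, 1.0), task, 1))
            data.append((rng.uniform(low, high), task, 0))
    return data


def train(data, n_tasks, epochs=50, lr=0.01, seed=0):
    # Joint SGD on the logistic loss over all tasks: s and b are shared,
    # while each task only updates its own embedding.
    rng = random.Random(seed)
    params = {"emb": [0.0] * n_tasks, "s": 0.1, "b": 0.0}
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, task, y in data:
            diff = x - params["emb"][task]
            err = sigmoid(log_ratio(x, task, params)) - y  # dLoss/dLogit
            grad_emb = err * 2.0 * params["s"] * diff
            grad_s = err * (-(diff ** 2))
            grad_b = err
            params["emb"][task] -= lr * grad_emb
            params["s"] -= lr * grad_s
            params["b"] -= lr * grad_b
    return params


rng = random.Random(0)
toy_data = make_data([-2.0, 2.0], n_per_task=200, low=-6.0, high=6.0, rng=rng)
toy_params = train(toy_data, n_tasks=2)
# Score the same user feature against both items; lower means more anomalous.
scores = [log_ratio(-2.0, t, toy_params) for t in range(2)]
```

In this toy setup, a user feature close to an item's learned embedding receives a higher log-ratio score for that item than for an item whose embedding is far away. Because the scale and bias are shared across items, each additional item contributes only one new parameter, which is a simplified picture of why joint learning can help when per-item data is sparse.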
