Paper Title

On-Device Next-Item Recommendation with Self-Supervised Knowledge Distillation

Paper Authors

Xin Xia, Hongzhi Yin, Junliang Yu, Qinyong Wang, Guandong Xu, Nguyen Quoc Viet Hung

Paper Abstract

Modern recommender systems operate in a fully server-based fashion. Catering to millions of users requires frequent model maintenance and high-speed processing of concurrent user requests, which comes at the cost of a huge carbon footprint. Meanwhile, users must upload their behavior data, including even the immediate environmental context, to the server, raising public concerns about privacy. On-device recommender systems circumvent these two issues with cost-conscious settings and local inference. However, due to limited memory and computing resources, on-device recommender systems are confronted with two fundamental challenges: (1) how to reduce the size of regular models to fit edge devices, and (2) how to retain the original capacity. Previous research mostly adopts tensor decomposition techniques to compress regular recommendation models, but only with a limited compression ratio, so as to avoid drastic performance degradation. In this paper, we explore ultra-compact models for next-item recommendation by loosening the constraint of dimensionality consistency in tensor decomposition. Meanwhile, to compensate for the capacity loss caused by compression, we develop a self-supervised knowledge distillation framework that enables the compressed model (student) to distill the essential information lying in the raw data, and improves long-tail item recommendation through an embedding-recombination strategy with the original model (teacher). Extensive experiments on two benchmarks demonstrate that, with a 30x reduction in model size, the compressed model suffers almost no accuracy loss, and even outperforms its uncompressed counterpart in most cases.
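
As a rough illustration of the two ingredients the abstract names, the sketch below pairs a low-rank factorized embedding table (one simple form of tensor-decomposition compression) with a standard soft-target distillation loss. This is a minimal PyTorch sketch, not the paper's method: the class `CompressedEmbedding`, the `rank` parameter, and the stand-in teacher scores are all hypothetical, and the paper's relaxed dimensionality constraint, self-supervised objectives, and embedding-recombination strategy are not reproduced here.

```python
# Minimal sketch only. Assumes: a generic low-rank embedding factorization as a
# stand-in for tensor-decomposition compression, and Hinton-style soft-target
# distillation as a stand-in for the paper's self-supervised KD framework.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CompressedEmbedding(nn.Module):
    """Replaces a full |V| x d item-embedding table with two low-rank factors.

    Parameter count drops from |V|*d to |V|*r + r*d, so a small rank r (r << d)
    yields a large compression ratio.
    """

    def __init__(self, num_items: int, dim: int, rank: int):
        super().__init__()
        self.base = nn.Embedding(num_items, rank)     # |V| x r factor
        self.proj = nn.Linear(rank, dim, bias=False)  # r x d factor

    def forward(self, item_ids: torch.Tensor) -> torch.Tensor:
        # Full-dimensional embeddings are reconstructed on the fly.
        return self.proj(self.base(item_ids))


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-target KD loss: the student matches the teacher's softened scores."""
    t = temperature
    soft_teacher = F.softmax(teacher_logits / t, dim=-1)
    log_student = F.log_softmax(student_logits / t, dim=-1)
    return F.kl_div(log_student, soft_teacher, reduction="batchmean") * (t * t)


if __name__ == "__main__":
    num_items, dim, rank = 10_000, 64, 8
    student = CompressedEmbedding(num_items, dim, rank)
    full_params = num_items * dim
    compact_params = num_items * rank + rank * dim
    print(f"compression ratio: {full_params / compact_params:.1f}x")

    session_repr = torch.randn(32, dim)                    # stand-in session encodings
    all_items = torch.arange(num_items)
    student_logits = session_repr @ student(all_items).T   # (32, |V|) next-item scores
    teacher_logits = torch.randn(32, num_items)            # stand-in teacher scores
    print(f"KD loss: {distillation_loss(student_logits, teacher_logits).item():.4f}")
```

The compression ratio here comes entirely from the factorization arithmetic (|V|*d versus |V|*r + r*d); the paper's 30x figure would correspond to a far more aggressive decomposition than the toy numbers used above.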
