Paper title
From Clicks to Conversions: Recommendation for long-term reward
Paper authors
Abstract
Recommender systems are often optimised for short-term reward: a recommendation is considered successful if a reward (e.g. a click) can be observed immediately after it. The advantage of this framing is that, under some reasonable (although questionable) assumptions, it allows familiar supervised learning tools to be used for the recommendation task. However, it also means that long-term business metrics, such as sales or retention, are ignored. In this paper, we introduce a framework for modelling long-term rewards in the RecoGym simulation environment. We use this newly introduced functionality to showcase the problems caused by the last-click attribution scheme in conversion-optimised recommendation, and we propose a simple extension that leads to state-of-the-art results.