Paper Title

SiMCa: Sinkhorn Matrix Factorization with Capacity Constraints

Authors

Eric Daoud, Luca Ganassali, Antoine Baker, Marc Lelarge

Abstract

For a very broad range of problems, recommendation algorithms have been increasingly used over the past decade. In most of these algorithms, the predictions are built upon user-item affinity scores which are obtained from high-dimensional embeddings of items and users. In more complex scenarios, with geometrical or capacity constraints, prediction based on embeddings may not be sufficient and some additional features should be considered in the design of the algorithm. In this work, we study the recommendation problem in the setting where affinities between users and items are based both on their embeddings in a latent space and on their geographical distance in their underlying Euclidean space (e.g., $\mathbb{R}^2$), together with item capacity constraints. This framework is motivated by some real-world applications, for instance in healthcare: the task is to recommend hospitals to patients based on their location, pathology, and hospital capacities. In these applications, there is somewhat of an asymmetry between users and items: items are viewed as static points, their embeddings, capacities and locations constraining the allocation. Upon the observation of an optimal allocation, user embeddings, item capacities, and their positions in their underlying Euclidean space, our aim is to recover item embeddings in the latent space; doing so, we are then able to use this estimate, for example to predict future allocations. We propose an algorithm (SiMCa) based on matrix factorization enhanced with optimal transport steps to model user-item affinities and learn item embeddings from observed data. We then illustrate and discuss the results of such an approach for hospital recommendation on synthetic data.
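
To make the setting in the abstract concrete, here is a minimal NumPy sketch of a capacity-constrained soft allocation computed with Sinkhorn iterations. It is an illustration under stated assumptions, not the authors' implementation: the function names (`sinkhorn_plan`, `soft_allocation`), the cost definition (Euclidean distance minus embedding dot product), the unit mass per user, and the parameters `eps` and `n_iters` are all hypothetical choices. The learning step that recovers item embeddings from observed allocations is not shown.

```python
import numpy as np


def sinkhorn_plan(cost, row_mass, col_mass, eps=1.0, n_iters=500):
    """Entropy-regularized optimal transport plan via Sinkhorn iterations.

    Assumes row_mass.sum() == col_mass.sum(); otherwise no plan can
    satisfy both sets of marginals.
    """
    # Gibbs kernel; subtracting the minimum cost only rescales the kernel
    # and helps avoid underflow.
    K = np.exp(-(cost - cost.min()) / eps)
    u = np.ones_like(row_mass)
    v = np.ones_like(col_mass)
    for _ in range(n_iters):
        u = row_mass / (K @ v)      # rescale rows towards the user masses
        v = col_mass / (K.T @ u)    # rescale columns towards the capacities
    return u[:, None] * K * v[None, :]


def soft_allocation(user_emb, item_emb, user_pos, item_pos, capacities, eps=1.0):
    """Soft user-to-item allocation under item capacity constraints.

    Assumed cost (hypothetical, for illustration only):
    geographical distance minus latent-space affinity.
    """
    affinity = user_emb @ item_emb.T
    dist = np.linalg.norm(user_pos[:, None, :] - item_pos[None, :, :], axis=-1)
    cost = dist - affinity
    row_mass = np.ones(user_emb.shape[0])   # each user is assigned exactly once
    col_mass = capacities.astype(float)     # each item receives up to its capacity
    return sinkhorn_plan(cost, row_mass, col_mass, eps=eps)
```

In the hospital example, row `i` of the returned matrix can be read as a soft assignment of patient `i` across hospitals, and each column sums to that hospital's capacity. The sketch requires the capacities to sum to the number of users; a smaller `eps` gives sharper assignments but can need more iterations and more careful numerics.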
