Paper Title

Exploiting Transitivity for Top-k Selection with Score-Based Dueling Bandits

Authors

Matthew Groves, Juergen Branke

Abstract

We consider the problem of top-k subset selection in Dueling Bandit problems with score information. Real-world pairwise ranking problems often exhibit a high degree of transitivity and prior work has suggested sampling methods that exploit such transitivity through the use of parametric preference models like the Bradley-Terry-Luce (BTL) and Thurstone models. To date, this work has focused on cases where sample outcomes are win/loss binary responses. We extend this to selection problems where sampling results contain quantitative information by proposing a Thurstonian style model and adapting the Pairwise Optimal Computing Budget Allocation for subset selection (POCBAm) sampling method to exploit this model for efficient sample selection. We compare the empirical performance against standard POCBAm and other competing algorithms.
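To illustrate why a parametric score model lets a sampler exploit transitivity, here is a minimal sketch. It is not the paper's model or the POCBAm procedure. It assumes, purely for illustration, that each arm i has a latent mean mu[i] and that a duel between i and j returns a score difference drawn from a normal distribution with mean mu[i] - mu[j]. The function names (simulate_duels, fit_latent_scores, top_k) and the least-squares fit are hypothetical choices made for this example.

```python
import numpy as np

def simulate_duels(mu, pairs, sigma, rng):
    """Draw one noisy score difference per (i, j) pair.

    Assumed model (illustrative only): score ~ Normal(mu[i] - mu[j], sigma^2).
    """
    mu = np.asarray(mu, dtype=float)
    pairs = np.asarray(pairs)
    diff = mu[pairs[:, 0]] - mu[pairs[:, 1]]
    return diff + rng.normal(0.0, sigma, size=len(pairs))

def fit_latent_scores(n_arms, pairs, scores):
    """Least-squares estimate of the latent means from score differences.

    Only differences are observed, so the means are identified up to a
    constant shift; the estimate is centred to sum to zero.
    """
    pairs = np.asarray(pairs)
    rows = np.arange(len(pairs))
    design = np.zeros((len(pairs), n_arms))
    design[rows, pairs[:, 0]] = 1.0   # +1 for the first arm of each duel
    design[rows, pairs[:, 1]] = -1.0  # -1 for the second arm
    mu_hat, *_ = np.linalg.lstsq(design, np.asarray(scores, dtype=float), rcond=None)
    return mu_hat - mu_hat.mean()

def top_k(mu_hat, k):
    """Indices of the k arms with the largest estimated latent means."""
    return set(np.argsort(mu_hat)[-k:].tolist())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_mu = [0.0, 1.0, 2.0, 3.0, 4.0]
    # Only neighbouring arms are ever compared; arm 0 never meets arm 4.
    chain = [(0, 1), (1, 2), (2, 3), (3, 4)] * 20
    scores = simulate_duels(true_mu, chain, sigma=0.1, rng=rng)
    print(top_k(fit_latent_scores(5, chain, scores), k=2))
```

Because every duel informs the shared latent means, the fit ranks arms that were never compared directly. That is the kind of transitivity a model-based sampling rule can exploit when deciding which pair to duel next.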
