论文标题
预算随机效用最大化的测试评分算法
Test Score Algorithms for Budgeted Stochastic Utility Maximization
论文作者
论文摘要
基于解决效用最大化问题的单个项目得分设计算法的最新发展,我们研究了使用测试分数的框架,该框架定义为观察到的单个项目绩效数据的统计数据,用于解决预算的随机效用最大化问题。我们扩展了现有的评分机制,即复制测试分数,以合并异质项目成本和项目值。我们表明,一种天然的贪婪算法,仅根据其复制测试分数输出解决方案在恒定因素的最佳因素中,该算法的最佳效应量。我们的算法和近似值保证假设测试得分是对单个项目值的边际分布的某些期望值的嘈杂估计值,从而使我们的算法实用,并且扩展了以前的工作,以假定噪音无噪声估计。此外,我们展示了如何将我们的算法适应以流式传输方式到达的设置,同时保持相同的近似保证。我们使用学术界的合成数据和数据集提出数值结果。STACKEXCHANGE问答论坛,该论坛表明我们的测试分数算法可以实现竞争力,并且在某些情况下比基准测试算法更好地性能,该基准算法需要访问值oracle才能评估功能值。
Motivated by recent developments in designing algorithms based on individual item scores for solving utility maximization problems, we study the framework of using test scores, defined as a statistic of observed individual item performance data, for solving the budgeted stochastic utility maximization problem. We extend an existing scoring mechanism, namely the replication test scores, to incorporate heterogeneous item costs as well as item values. We show that a natural greedy algorithm that selects items solely based on their replication test scores outputs solutions within a constant factor of the optimum for a broad class of utility functions. Our algorithms and approximation guarantees assume that test scores are noisy estimates of certain expected values with respect to marginal distributions of individual item values, thus making our algorithms practical and extending previous work that assumes noiseless estimates. Moreover, we show how our algorithm can be adapted to the setting where items arrive in a streaming fashion while maintaining the same approximation guarantee. We present numerical results, using synthetic data and data sets from the Academia.StackExchange Q&A forum, which show that our test score algorithm can achieve competitiveness, and in some cases better performance than a benchmark algorithm that requires access to a value oracle to evaluate function values.