Paper title
How to sample and when to stop sampling: The generalized Wald problem and minimax policies
Paper authors
Abstract
We study sequential experiments in which sampling is costly and a decision-maker aims to determine the best treatment for full-scale implementation by (1) adaptively allocating units between two possible treatments, and (2) stopping the experiment when the expected welfare (inclusive of sampling costs) from implementing the chosen treatment is maximized. Working under a continuous-time limit, we characterize the optimal policies under the minimax regret criterion. We show that the same policies remain optimal under both parametric and non-parametric outcome distributions in an asymptotic regime where sampling costs approach zero. The minimax optimal sampling rule is just the Neyman allocation: it is independent of sampling costs and does not adapt to observed outcomes. The decision-maker stops sampling when the product of the average treatment difference and the number of observations exceeds a specific threshold. The results also apply to the so-called best-arm identification problem, where the number of observations is exogenously specified.
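The two rules described in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the paper's procedure: outcomes are assumed Gaussian with known standard deviations, the Neyman allocation is taken in its usual form (shares proportional to the standard deviations), and `threshold` is a hypothetical input, since the abstract does not say how the threshold is computed. The names `neyman_share`, `run_experiment`, and `Result` are invented for this illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Result:
    stopped: bool  # True if the threshold was crossed before max_n
    n: int         # total observations collected
    n1: int        # observations assigned to treatment 1
    n0: int        # observations assigned to treatment 0
    diff: float    # difference in sample means (treatment 1 minus treatment 0)
    chosen: int    # treatment selected for implementation


def neyman_share(sigma1, sigma0):
    """Fraction of units sent to treatment 1, proportional to its std. dev."""
    return sigma1 / (sigma1 + sigma0)


def run_experiment(mu1, mu0, sigma1, sigma0, threshold, max_n=100_000, seed=0):
    """Simulate the experiment; `threshold` is a hypothetical, user-supplied value."""
    rng = random.Random(seed)
    share1 = neyman_share(sigma1, sigma0)
    n1 = n0 = 0
    sum1 = sum0 = 0.0
    diff = 0.0
    for n in range(1, max_n + 1):
        # Allocation depends only on the counts, never on observed outcomes.
        if n1 < share1 * n:
            sum1 += rng.gauss(mu1, sigma1)
            n1 += 1
        else:
            sum0 += rng.gauss(mu0, sigma0)
            n0 += 1
        if n1 > 0 and n0 > 0:
            diff = sum1 / n1 - sum0 / n0
            # Stop once |average difference| * (number of observations)
            # exceeds the threshold.
            if abs(diff) * n > threshold:
                return Result(True, n, n1, n0, diff, 1 if diff > 0 else 0)
    return Result(False, max_n, n1, n0, diff, 1 if diff > 0 else 0)
```

Calling `run_experiment(0.5, 0.0, 2.0, 1.0, threshold=50.0)` returns a `Result` whose `n1 / n` stays close to `neyman_share(2.0, 1.0)` regardless of the outcomes drawn, which is the sense in which the sampling rule is non-adaptive; only the stopping time reacts to the data.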