Paper Title

Uniformly Bounded Regret in Dynamic Fair Allocation

Authors

Balseiro, Santiago R.; Xia, Shangzhou

Abstract

We study a dynamic allocation problem in which $T$ sequentially arriving divisible resources are to be allocated to a number of agents with linear utilities. The marginal utilities of each resource to the agents are drawn stochastically from a known joint distribution, independently and identically across time, and the central planner makes immediate and irrevocable allocation decisions. Most works on dynamic resource allocation aim to maximize the utilitarian welfare, i.e., the efficiency of the allocation, which may result in unfair concentration of resources on certain high-utility agents while leaving others' demands under-fulfilled. In this paper, aiming at balancing efficiency and fairness, we instead consider a broad collection of welfare metrics, the Hölder means, which includes the Nash social welfare and the egalitarian welfare. To this end, we first study a fluid-based policy derived from a deterministic surrogate to the underlying problem and show that for all smooth Hölder mean welfare metrics it attains an $O(1)$ regret over the time horizon length $T$ against the hindsight optimum, i.e., the optimal welfare if all utilities were known in advance of deciding on allocations. However, when evaluated under the non-smooth egalitarian welfare, the fluid-based policy attains a regret of order $\Theta(\sqrt{T})$. We then propose a new policy built thereupon, called Backward Infrequent Re-solving with Thresholding ($\mathsf{BIRT}$), which consists of re-solving the deterministic surrogate problem at most $O(\log\log T)$ times. We prove the $\mathsf{BIRT}$ policy attains an $O(1)$ regret against the hindsight optimal egalitarian welfare, independently of the time horizon length $T$. We conclude by presenting numerical experiments to corroborate our theoretical claims and to illustrate the significant performance improvement against several benchmark policies.
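To make the welfare family mentioned in the abstract concrete, here is a minimal Python sketch of the standard Hölder (power) mean of agents' total utilities. It is an illustration under assumptions, not the paper's implementation: the function name `holder_mean`, the unweighted form, and the parametrization by an exponent `p` are chosen here for exposition, and the paper may normalize or weight agents differently. Under this standard definition, `p = 1` gives the (average) utilitarian welfare, `p = 0` gives the geometric mean (the Nash social welfare up to normalization), and `p = -inf` gives the minimum utility (the egalitarian welfare).

```python
import math


def holder_mean(utilities, p):
    """Unweighted power mean of strictly positive utilities (illustrative helper).

    Assumption: every entry of `utilities` is > 0, so the logarithm and
    negative exponents are well defined.
    """
    n = len(utilities)
    if p == float("-inf"):
        # Limit p -> -infinity: the worst-off agent's utility (egalitarian).
        return min(utilities)
    if p == 0:
        # Limit p -> 0: the geometric mean (Nash social welfare, normalized).
        return math.exp(sum(math.log(u) for u in utilities) / n)
    # General case: (average of u^p)^(1/p).
    return (sum(u ** p for u in utilities) / n) ** (1.0 / p)


# Hypothetical total utilities of two agents under some allocation.
example = [1.0, 4.0]
utilitarian = holder_mean(example, 1)
nash = holder_mean(example, 0)
egalitarian = holder_mean(example, float("-inf"))
```

The minimum is not differentiable where two agents' utilities tie, which is one way to see why the abstract treats the egalitarian case as non-smooth and handles it separately from the smooth Hölder means.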
