Paper Title
Universal Caching
Paper Authors
Paper Abstract
In learning theory, the performance of an online policy is commonly measured in terms of the static regret metric, which compares the cumulative loss of the online policy to that of an optimal benchmark in hindsight. In the definition of static regret, the action of the benchmark policy remains fixed throughout the time horizon. Naturally, the resulting regret bounds become loose in non-stationary settings, where fixed actions often suffer from poor performance. In this paper, we investigate a stronger notion of regret minimization in the context of online caching. In particular, we allow the action of the benchmark at any round to be decided by a finite state machine containing any number of states. Popular caching policies, such as LRU and FIFO, belong to this class. Using ideas from the universal prediction literature in information theory, we propose an efficient online caching policy with a sub-linear regret bound. To the best of our knowledge, this is the first data-dependent regret bound known for the caching problem in the universal setting. We establish this result by combining a recently proposed online caching policy with an incremental parsing algorithm, namely Lempel-Ziv '78. Our methods also yield a simpler learning-theoretic proof of the improved regret bound, as opposed to the involved problem-specific combinatorial arguments used in earlier works.
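
For context, the static regret discussed in the abstract is typically written as below. The notation (per-round loss f_t, online action x_t, and action set X) follows standard convention and is not taken from this abstract.

```latex
% Static regret after T rounds: cumulative loss of the online policy
% minus that of the best single fixed action chosen in hindsight.
R_T^{\mathrm{static}} \;=\; \sum_{t=1}^{T} f_t(x_t) \;-\; \min_{x \in \mathcal{X}} \sum_{t=1}^{T} f_t(x)
```

The universal notion considered in the paper strengthens the benchmark: instead of a single fixed action, the comparator's action at each round may be produced by an arbitrary finite state machine, which subsumes the fixed-action benchmark as the single-state special case.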
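The incremental parsing step named in the abstract, Lempel-Ziv '78, splits the request stream into phrases, each of which extends a previously seen phrase by one symbol. The following is a minimal sketch of a generic LZ78 phrase parser, assuming a request trace of hashable page identifiers; the function name `lz78_parse` and the example trace are illustrative only, and the sketch covers just the parsing component, not how the paper combines it with an online caching policy.

```python
def lz78_parse(requests):
    """Partition a request sequence into LZ78 phrases.

    Each phrase is the shortest prefix of the remaining stream that has
    not been parsed before, i.e. a previously seen phrase plus one symbol.
    """
    dictionary = {(): 0}   # maps phrase (tuple of symbols) -> phrase index
    phrases = []
    current = ()
    for symbol in requests:
        candidate = current + (symbol,)
        if candidate in dictionary:
            # Keep extending the current phrase.
            current = candidate
        else:
            # A new phrase is complete; record it and restart.
            dictionary[candidate] = len(dictionary)
            phrases.append(candidate)
            current = ()
    if current:
        phrases.append(current)  # possibly incomplete final phrase
    return phrases


if __name__ == "__main__":
    # Hypothetical page-request trace used only to show the parsing.
    trace = ["a", "a", "b", "a", "b", "a", "b", "b"]
    print(lz78_parse(trace))
    # -> [('a',), ('a', 'b'), ('a', 'b', 'a'), ('b',), ('b',)]
```

In the universal setting, the phrases grown by such a parser induce a set of contexts (states) that adapts to the observed request sequence; the paper's contribution is in how a base online caching policy is run with respect to these contexts, which is not reproduced in this sketch.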