Title
Fast Bayesian Coresets via Subsampling and Quasi-Newton Refinement
Authors
Abstract
Bayesian coresets approximate a posterior distribution by building a small weighted subset of the data points. Any inference procedure that is too computationally expensive to be run on the full posterior can instead be run inexpensively on the coreset, with results that approximate those on the full data. However, current approaches are limited either by a significant run-time or by the need for the user to specify a low-cost approximation to the full posterior. We propose a Bayesian coreset construction algorithm that first selects a uniformly random subset of the data, and then optimizes the weights using a novel quasi-Newton method. Our algorithm is a simple-to-implement, black-box method that does not require the user to specify a low-cost posterior approximation. It is the first to come with a general high-probability bound on the KL divergence of the output coreset posterior. Experiments demonstrate that our method provides significant improvements in coreset quality over alternatives with comparable construction times, with far less storage cost and user input required.
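To make the two-stage idea concrete, the sketch below illustrates it on a common finite projection of the problem: each data point is represented by its log-likelihood evaluated at a fixed set of posterior samples, a uniform random subset is drawn, and nonnegative coreset weights are then refined with a quasi-Newton optimizer. This is a minimal illustrative sketch, not the paper's algorithm: the synthetic matrix `L`, the squared-residual objective, and the use of scipy's L-BFGS-B in place of the paper's custom quasi-Newton update are all assumptions for demonstration.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Synthetic stand-in: row i holds data point i's log-likelihood
# evaluated at S posterior samples (a finite-dimensional projection).
N, S, M = 1000, 50, 30          # data points, samples, coreset size
L = rng.normal(size=(N, S))

full = L.sum(axis=0)            # full-data log-likelihood vector

# Stage 1: uniform random subsample of M points.
idx = rng.choice(N, size=M, replace=False)
Lsub = L[idx]

# Stage 2: refine nonnegative weights with a quasi-Newton method
# (L-BFGS-B here, standing in for the paper's novel update).
def obj(w):
    r = Lsub.T @ w - full       # residual vs. full-data vector
    return r @ r, 2.0 * (Lsub @ r)  # objective value and gradient

w0 = np.full(M, N / M)          # uniform weights, unbiased start
res = minimize(obj, w0, jac=True, method="L-BFGS-B",
               bounds=[(0.0, None)] * M)
w = res.x                       # refined coreset weights
```

In this toy setup the refinement can only improve on the uniform initialization, mirroring the intuition that subsampling gives a cheap starting point and the quasi-Newton step supplies the quality.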