用于实现增量近端方法的有效算法

论文标题

用于实现增量近端方法的有效算法

Efficient algorithms for implementing incremental proximal-point methods

论文作者

Shtoff, Alex

论文摘要

在每个计算步骤中观察一小部分训练集的模型培训算法在实用的机器学习中无处不在，并包括随机和在线优化方法。在绝大多数情况下，这种算法通常通过成本功能的梯度观察训练样品。因此，这些方法利用是通过其一阶近似值的成本函数的斜率。为了解决基于梯度的方法的局限性，例如对随机环境中对阶梯尺寸选择的敏感性，或者无法在在线环境中使用小功能变异性，几种研究流的尝试试图利用有关成本函数的更多信息，而不是通过知名近端运营商来利用其梯度的更多信息。但是，实践中实施此类方法会提出一个挑战，因为每个迭代步骤都归结为计算近端操作员，这可能并不容易。在这项工作中，我们设计了一个新颖的算法框架，该框架利用凸双重性理论可以实现近端运算符实现的算法效率和软件模块化，以便通过对较大的研究人员和从事的研究人员进行研究，以对其进行研究，从而使他们的研究人员和从事的练习对其进行研究。我们为本文在https://github.com/alexshtf/inc_prox_pt/releases/tag/prox_pt_paper上开发的框架提供了参考Python实施，作为开源库，并证明了我们在各种问题上的实施，并重现了本文中的几个问题。纯Python参考实现不一定是最有效的，而是通过将Python与本机后端相结合来创建有效实现的基础。

Model training algorithms which observe a small portion of the training set in each computational step are ubiquitous in practical machine learning, and include both stochastic and online optimization methods. In the vast majority of cases, such algorithms typically observe the training samples via the gradients of the cost functions the samples incur. Thus, these methods exploit are the slope of the cost functions via their first-order approximations. To address limitations of gradient-based methods, such as sensitivity to step-size choice in the stochastic setting, or inability to use small function variability in the online setting, several streams of research attempt to exploit more information about the cost functions than just their gradients via the well-known proximal operators. However, implementing such methods in practice poses a challenge, since each iteration step boils down to computing the proximal operator, which may not be easy. In this work we devise a novel algorithmic framework, which exploits convex duality theory to achieve both algorithmic efficiency and software modularity of proximal operator implementations, in order to make experimentation with incremental proximal optimization algorithms accessible to a larger audience of researchers and practitioners, by reducing the gap between their theoretical description in research papers and their use in practice. We provide a reference Python implementation for the framework developed in this paper as an open source library at on https://github.com/alexshtf/inc_prox_pt/releases/tag/prox_pt_paper, along with examples which demonstrate our implementation on a variety of problems, and reproduce the numerical experiments in this paper. The pure Python reference implementation is not necessarily the most efficient, but is a basis for creating efficient implementations by combining Python with a native backend.

下载PDF全文

下载文献需遵守相关版权规定

论文标题