Paper Title

Generalized Gumbel-Softmax Gradient Estimator for Generic Discrete Random Variables

Paper Authors

Weonyoung Joo, Dongjun Kim, Seungjae Shin, Il-Chul Moon

Paper Abstract

Estimating the gradients of stochastic nodes in stochastic computational graphs is one of the crucial research questions in the deep generative modeling community, as it enables gradient descent optimization of neural network parameters. Stochastic gradient estimators for discrete random variables have been widely explored, for example the Gumbel-Softmax reparameterization trick for Bernoulli and categorical distributions. Meanwhile, other discrete distributions, such as the Poisson, geometric, binomial, multinomial, and negative binomial, have not been explored. This paper proposes a generalized version of the Gumbel-Softmax estimator, which is able to reparameterize generic discrete distributions, not restricted to the Bernoulli and the categorical. The proposed estimator utilizes the truncation of discrete random variables, the Gumbel-Softmax trick, and a special form of linear transformation. Our experiments consist of (1) synthetic examples and applications on VAEs, which show the efficacy of our method; and (2) topic models, which demonstrate the value of the proposed estimator in practice.
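The abstract describes three ingredients: truncating the discrete distribution to a finite support, applying the Gumbel-Softmax trick over that support, and recovering a (relaxed) integer sample via a linear transformation. The following is a minimal NumPy sketch of that pipeline for a Poisson distribution, written from the abstract's description rather than the paper's reference code; the function names, the truncation level `K`, and the plain inner-product linear map are illustrative assumptions.

```python
import numpy as np

def gumbel_softmax_sample(log_probs, tau, rng):
    """Relaxed one-hot sample: perturb log-probs with Gumbel noise, softmax at temperature tau."""
    g = -np.log(-np.log(rng.uniform(size=log_probs.shape)))  # standard Gumbel noise
    y = (log_probs + g) / tau
    y = np.exp(y - y.max())  # numerically stable softmax
    return y / y.sum()

def relaxed_poisson_sample(lam, K, tau, rng):
    """Relaxed sample from Poisson(lam), truncated to the support {0, ..., K-1}."""
    ks = np.arange(K)
    # log pmf of Poisson(lam): k*log(lam) - lam - log(k!)
    log_fact = np.array([np.sum(np.log(np.arange(1, k + 1))) for k in ks])
    log_p = ks * np.log(lam) - lam - log_fact
    log_p -= np.log(np.sum(np.exp(log_p)))  # renormalize the truncated pmf
    # Relaxed one-hot vector over the truncated support
    y = gumbel_softmax_sample(log_p, tau, rng)
    # Linear transformation: inner product with the support values
    # turns the relaxed one-hot into a relaxed integer-valued sample
    return float(ks @ y)
```

As the temperature `tau` approaches zero, the relaxed one-hot vector concentrates on a single support value and the sample approaches an exact draw from the truncated distribution, while larger `tau` keeps the sample smooth and hence differentiable with respect to the distribution parameters.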
