Paper title
Hierarchical Dirichlet Process and Relative Entropy
Authors
Abstract
The hierarchical Dirichlet process is a discrete random measure that serves as an important prior in Bayesian nonparametrics. It is motivated by the study of groups of clustered data. Each group is modelled through a level-two Dirichlet process, and all groups share the same base distribution, which is itself drawn from a level-one Dirichlet process. It has two concentration parameters, one at each level. The main results of the paper are the law of large numbers and large deviations for the hierarchical Dirichlet process and its mass when both concentration parameters converge to infinity. The large deviation rate functions are identified explicitly. The rate function for the hierarchical Dirichlet process consists of two terms corresponding to the relative entropies at each level. It is less than the rate function for the Dirichlet process, which reflects the fact that the number of clusters under the hierarchical Dirichlet process grows more slowly than under the Dirichlet process.
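The two-level construction described in the abstract can be illustrated with a minimal simulation sketch. It is not taken from the paper: the function names (`stick_breaking`, `dirichlet`, `sample_hdp`), the Uniform(0, 1) base measure, and the truncation level are all assumptions made for illustration. The level-one Dirichlet process is approximated by truncated stick-breaking. Given that finite discrete measure, each group's level-two weights on the shared atoms follow a finite-dimensional Dirichlet distribution.

```python
import random


def stick_breaking(concentration, truncation, rng):
    """Truncated stick-breaking weights for a Dirichlet process (approximation)."""
    weights, remaining = [], 1.0
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, concentration)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)  # leftover mass is assigned to the last atom
    return weights


def dirichlet(shapes, rng):
    """Dirichlet draw via normalised Gamma variates; zero shapes get zero mass."""
    draws = [rng.gammavariate(a, 1.0) if a > 0.0 else 0.0 for a in shapes]
    total = sum(draws)
    return [d / total for d in draws]


def sample_hdp(level_one, level_two, n_groups, truncation=50, seed=0):
    """Sample shared atoms, level-one weights, and per-group level-two weights.

    level_one and level_two are the two concentration parameters.
    The Uniform(0, 1) base measure is an illustrative assumption.
    """
    rng = random.Random(seed)
    atoms = [rng.random() for _ in range(truncation)]
    shared = stick_breaking(level_one, truncation, rng)
    # Every group re-weights the same atoms, centred on the shared weights.
    groups = [dirichlet([level_two * w for w in shared], rng)
              for _ in range(n_groups)]
    return atoms, shared, groups


atoms, shared, groups = sample_hdp(level_one=5.0, level_two=10.0, n_groups=3)
```

All groups place their mass on the same set of atoms, which is how clusters come to be shared across groups. Increasing both concentration parameters in this sketch pulls the group weights towards the shared weights, and the shared weights towards the base measure, which is the regime the abstract's limit results concern.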