Paper Title
Mixture Complexity and Its Application to Gradual Clustering Change Detection
Paper Authors
Paper Abstract
In model-based clustering with finite mixture models, determining the number of clusters (the cluster size) is a significant challenge. Conventionally, the cluster size has been taken to equal the number of mixture components (the mixture size); however, this may not be valid when components overlap or their weights are biased. In this study, we propose to measure the cluster size of a mixture model continuously through a new concept called mixture complexity (MC). MC is formally defined from the viewpoint of information theory and can be seen as a natural extension of the cluster size that accounts for overlap and weight bias. We then apply MC to the problem of gradual clustering change detection. Conventionally, clustering changes have been considered abrupt, induced by changes in the mixture size or cluster size. In contrast, we consider clustering changes to be gradual in terms of MC, which has the benefits of detecting changes earlier and of distinguishing significant changes from insignificant ones. We further demonstrate that MC can be decomposed according to the hierarchical structure of a mixture model, which helps us analyze the details of its substructures.
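The abstract does not state the formula for MC. As a rough illustration only, the sketch below assumes that MC is the mutual information between a data point and its latent cluster assignment, estimated from posterior responsibilities as the entropy of the averaged responsibilities minus the average entropy of each row. Under that assumption, exp(MC) behaves like a continuous cluster size between 1 and the number of components. The function names `entropy` and `mixture_complexity` are hypothetical and are not taken from the paper.

```python
import numpy as np


def entropy(p, axis=-1):
    """Shannon entropy in nats, treating 0 * log 0 as 0."""
    p = np.asarray(p, dtype=float)
    safe = np.where(p > 0, p, 1.0)  # log(1) = 0, so zero entries contribute nothing
    return -np.sum(p * np.log(safe), axis=axis)


def mixture_complexity(resp):
    """Assumed MC estimate: H(mean responsibility) - mean H(responsibility).

    resp is an (N, K) array whose rows are posterior probabilities of the
    K mixture components for each of N points (each row sums to 1).
    This definition is an assumption made for illustration.
    """
    resp = np.asarray(resp, dtype=float)
    marginal = resp.mean(axis=0)            # estimated component weights
    conditional = entropy(resp, axis=1).mean()  # average assignment uncertainty
    return entropy(marginal) - conditional


# Two well-separated, equally weighted components: hard assignments.
separated = np.array([[1.0, 0.0], [0.0, 1.0]] * 50)
# Two completely overlapping components: every point is a 50/50 split.
overlapping = np.full((100, 2), 0.5)
# Two separated components with a 90/10 weight bias.
biased = np.array([[1.0, 0.0]] * 90 + [[0.0, 1.0]] * 10)

for name, resp in [("separated", separated),
                   ("overlapping", overlapping),
                   ("biased", biased)]:
    mc = mixture_complexity(resp)
    print(f"{name}: MC = {mc:.4f}, exp(MC) = {np.exp(mc):.4f}")
```

In this sketch the separated case gives exp(MC) equal to 2 and the fully overlapping case gives exp(MC) equal to 1, while the weight-biased case falls in between. This matches the abstract's description of a cluster size that shrinks under overlap and weight bias. In practice the responsibilities would come from a fitted mixture model rather than being written by hand.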