identifyMixture: Solve label switching and identify mixture.

View source: R/identifyMixture.R

identifyMixture    R Documentation

Solve label switching and identify mixture.

Description

Resolves label switching by clustering the draws in the point process representation (PPR) with k-means and relabeling the draws accordingly.

Usage

identifyMixture(Func, Mu, Eta, S, centers)

Arguments

Func

A numeric array of dimension M × d × K; data for clustering in the PPR.

Mu

A numeric array of dimension M × r × K; draws of cluster means.

Eta

A numeric array of dimension M × K; draws of cluster sizes.

S

A numeric matrix of dimension M × N; draws of cluster assignments.

centers

An integer or a numeric matrix of dimension K × d; used to initialize stats::kmeans().
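The two accepted forms of centers correspond to the two ways stats::kmeans() can be initialized. The following is a minimal sketch on simulated data, not the package's internal code; the matrix x and the starting values in init are hypothetical and only illustrate the shapes involved.

```r
set.seed(42)

# Hypothetical stacked functionals: 300 rows, d = 2 columns,
# drawn around three well-separated locations.
x <- rbind(
  matrix(rnorm(200, mean = -4, sd = 0.5), ncol = 2),
  matrix(rnorm(200, mean =  0, sd = 0.5), ncol = 2),
  matrix(rnorm(200, mean =  4, sd = 0.5), ncol = 2)
)

# Option 1: an integer. kmeans() then picks that many rows of x
# at random as starting centers (nstart repeats this several times).
km_int <- stats::kmeans(x, centers = 3, nstart = 10)

# Option 2: a K x d matrix of explicit starting centers, one row per cluster.
init <- rbind(c(-4, -4), c(0, 0), c(4, 4))
km_mat <- stats::kmeans(x, centers = init)

# Cluster sizes under the explicit initialization.
table(km_mat$cluster)
```

Supplying a matrix makes the starting point reproducible and ties cluster index k to row k of the initial centers, whereas an integer leaves the initialization to random selection.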

Details

The following steps are implemented:

  • A functional of the draws of the component-specific parameters (Func) is passed to the function. The functionals of all components and iterations are stacked on top of each other to obtain a matrix with M·K rows and d columns, where each row holds the functional of one component in one iteration.

  • The functionals are clustered into K_+ clusters using k-means clustering. For each functional a group label is obtained.

  • The obtained labels of the functionals are used to construct a classification for each MCMC iteration. Those classifications which are a permutation of (1, ..., K_+) are used to reorder the Mu and Eta draws and the assignment matrix S. This results in an identified mixture model.

  • Only iterations resulting in permutations are used for parameter estimation and for deriving the final partition. MCMC iterations where the classification of the functionals is not a permutation of (1, ..., K_+) are discarded, because no unique assignment of functionals to components can be made. A high non-permutation rate, i.e. a high proportion of MCMC iterations where the classification is not a permutation, indicates a poor clustering solution in which the functionals are not clearly separated.
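The steps above can be sketched in base R as follows. This is an illustration under stated assumptions, not the source of identifyMixture(): the simulated Func array, the sizes M, K and d, the true_means vector and the helper names (stacked, labels, is_perm, Func_ident) are all hypothetical, and only the functionals themselves are relabeled here.

```r
set.seed(1)
M <- 200; K <- 3; d <- 1           # hypothetical numbers of draws, components, dimensions
true_means <- c(-5, 0, 5)          # hypothetical, well-separated component locations

# Simulate functional draws whose component labels switch at random per iteration.
Func <- array(NA_real_, dim = c(M, d, K))
for (m in seq_len(M)) {
  perm <- sample(K)
  Func[m, 1, ] <- rnorm(K, mean = true_means[perm], sd = 0.3)
}

# Step 1: stack into an (M*K) x d matrix; row (k - 1) * M + m holds
# the functional of component k in iteration m.
stacked <- matrix(aperm(Func, c(1, 3, 2)), nrow = M * K, ncol = d)

# Step 2: cluster the stacked functionals with k-means.
km <- stats::kmeans(stacked, centers = matrix(true_means, ncol = d))
labels <- matrix(km$cluster, nrow = M, ncol = K)   # labels[m, k]

# Step 3: keep only iterations whose labels are a permutation of 1, ..., K.
is_perm <- apply(labels, 1, function(z) all(sort(z) == seq_len(K)))
non_perm_rate <- mean(!is_perm)

# Step 4: relabel the kept iterations so that component k of draw m
# moves to position labels[m, k].
kept <- which(is_perm)
Func_ident <- array(NA_real_, dim = c(length(kept), d, K))
for (i in seq_along(kept)) {
  m <- kept[i]
  Func_ident[i, , labels[m, ]] <- Func[m, , ]
}

non_perm_rate
```

The same index vector labels[m, ] would be applied to the corresponding draws of Mu and Eta, and the entries of S for iteration m would be mapped through it, so that all outputs share one consistent labeling.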

Value

A named list containing:

  • "S": reordered assignments.

  • "Mu": reordered draws of the cluster means.

  • "Eta": reordered Eta draws (weights).

  • "non_perm_rate": proportion of draws where the clustering did not result in a permutation and hence no relabeling could be performed; this is the proportion of draws discarded.


telescope documentation built on April 4, 2025, 2:40 a.m.