piv_MCMC: JAGS/Stan Sampling for Gaussian Mixture Models and Clustering...

View source: R/bayes_mcmc.R

piv_MCMCR Documentation

JAGS/Stan Sampling for Gaussian Mixture Models and Clustering via Co-Association Matrix.

Description

Perform MCMC JAGS sampling or HMC Stan sampling for Gaussian mixture models, post-process the chains and apply a clustering technique to the MCMC sample. Pivotal units for each group are selected among four alternative criteria.

Usage

piv_MCMC(
  y,
  k,
  nMC,
  priors,
  piv.criterion = c("MUS", "maxsumint", "minsumnoint", "maxsumdiff"),
  clustering = c("diana", "hclust"),
  software = c("rjags", "rstan"),
  burn = 0.5 * nMC,
  chains = 4,
  cores = 1,
  sparsity = FALSE
)

Arguments

y

N-dimensional vector for univariate data or N \times D matrix for multivariate data.

k

Number of mixture components.

nMC

Number of MCMC iterations for the JAGS/Stan function execution.

priors

Input prior hyperparameters (see Details for default options).

piv.criterion

The pivotal criterion used for identifying one pivot for each group. Possible choices are: "MUS", "maxsumint", "minsumnoint", "maxsumdiff". The default method is "maxsumint" (see the Details and the vignette).

clustering

The algorithm adopted for partitioning the N observations into k groups. Possible choices are "diana" (default) or "hclust" for divisive and agglomerative hierarchical clustering, respectively.

software

The selected MCMC method to fit the model: "rjags" for the JAGS method, "rstan" for the Stan method. Default is "rjags".

burn

The burn-in period (only if method "rjags" is selected). Default is 0.5\times nMC.

chains

A positive integer specifying the number of Markov chains. The default is 4.

cores

The number of cores to use when executing the Markov chains in parallel (only if software="rstan"). Default is 1.

sparsity

Allows for sparse finite mixtures, default is FALSE.

Details

The function fits univariate and multivariate Bayesian Gaussian mixture models of the form (here for univariate only):

(Y_i|Z_i=j) \sim \mathcal{N}(μ_j,σ_j),

where the Z_i, i=1,…,N, are i.i.d. random variables, j=1,…,k, σ_j is the group variance, Z_i \in {1,…,k } are the latent group allocation, and

P(Z_i=j)=η_j.

The likelihood of the model is then

L(y;μ,η,σ) = ∏_{i=1}^N ∑_{j=1}^k η_j \mathcal{N}(μ_j,σ_j),

where (μ, σ)=(μ_{1},…,μ_{k},σ_{1},…,σ_{k}) are the component-specific parameters and η=(η_{1},…,η_{k}) the mixture weights. Let ν denote a permutation of { 1,…,k }, and let ν(μ)= (μ_{ν(1)},…, μ_{ν(k)}), ν(σ)= (σ_{ν(1)},…, σ_{ν(k)}), ν(η)=(η_{ν(1)},…,η_{ν(k)}) be the corresponding permutations of μ, σ and η. Denote by V the set of all the permutations of the indexes {1,…,k }, the likelihood above is invariant under any permutation ν \in V, that is

L(y;μ,η,σ) = L(y;ν(μ),ν(η),ν(σ)).

As a consequence, the model is unidentified with respect to an arbitrary permutation of the labels. When Bayesian inference for the model is performed, if the prior distribution p_0(μ,η,σ) is invariant under a permutation of the indices, then so is the posterior. That is, if p_0(μ,η,σ) = p_0(ν(μ),ν(η),σ), then

p(μ,η,σ| y) \propto p_0(μ,η,σ)L(y;μ,η,σ)

is multimodal with (at least) k! modes.

Depending on the selected software, the model parametrization changes in terms of the prior choices. Precisely, the JAGS philosophy with the underlying Gibbs sampling is to use noninformative priors, and conjugate priors are preferred for computational speed. Conversely, Stan adopts weakly informative priors, with no need to explicitly use the conjugacy. For univariate mixtures, when software="rjags" the specification is the same as the function BMMmodel of the bayesmix package:

μ_j \sim \mathcal{N}(μ_0, 1/B0inv)

σ_j \sim \mbox{invGamma}(nu0Half, nu0S0Half)

η \sim \mbox{Dirichlet}(1,…,1)

S0 \sim \mbox{Gamma}(g0Half, g0G0Half),

with default values: μ_0=0, B0inv=0.1, nu0Half =10, S0=2, nu0S0Half= nu0Half\times S0, g0Half = 5e-17, g0G0Half = 5e-33, in accordance with the default specification:

priors=list(kind = "independence", parameter = "priorsFish", hierarchical = "tau")

(see bayesmix for further details and choices).

When software="rstan", the prior specification is:

μ_j \sim \mathcal{N}(μ_0, 1/B0inv)

σ_j \sim \mbox{Lognormal}(μ_{σ}, τ_{σ})

η_j \sim \mbox{Uniform}(0,1),

with default values: μ_0=0, B0inv=0.1, μ_{σ}=0, τ_{σ}=2. The users may specify new hyperparameter values with the argument:

priors=list(mu_0=1, B0inv=0.2, mu_sigma=3, tau_sigma=5)

For multivariate mixtures, when software="rjags" the prior specification is the following:

\bm{μ}_j \sim \mathcal{N}_D(\bm{μ}_0, S2)

Σ^{-1} \sim \mbox{Wishart}(S3, D+1)

η \sim \mbox{Dirichlet}(\bm{α}),

where \bm{α} is a k-dimensional vector and S_2 and S_3 are positive definite matrices. By default, \bm{μ}_0=\bm{0}, \bm{α}=(1,…,1) and S_2 and S_3 are diagonal matrices, with diagonal elements equal to 1e+05. The user may specify other values for the hyperparameters \bm{μ}_0, S_2, S_3 and \bm{α} via priors argument in such a way:

priors =list(mu_0 = c(1,1), S2 = ..., S3 = ..., alpha = ...)

with the constraint for S2 and S3 to be positive definite, and \bm{α} a vector of dimension k with nonnegative elements.

When software="rstan", the prior specification is:

\bm{μ}_j \sim \mathcal{N}_D(\bm{μ}_0, LD*L^{T})

L \sim \mbox{LKJ}(ε)

D^*_j \sim \mbox{HalfCauchy}(0, σ_d).

The covariance matrix is expressed in terms of the LDL decomposition as LD*L^{T}, a variant of the classical Cholesky decomposition, where L is a D \times D lower unit triangular matrix and D* is a D \times D diagonal matrix. The Cholesky correlation factor L is assigned a LKJ prior with ε degrees of freedom, which, combined with priors on the standard deviations of each component, induces a prior on the covariance matrix; as ε \rightarrow ∞ the magnitude of correlations between components decreases, whereas ε=1 leads to a uniform prior distribution for L. By default, the hyperparameters are \bm{μ}_0=\bm{0}, σ_d=2.5, ε=1. The user may propose some different values with the argument:

priors=list(mu_0=c(1,2), sigma_d = 4, epsilon =2)

If software="rjags" the function performs JAGS sampling using the bayesmix package for univariate Gaussian mixtures, and the runjags package for multivariate Gaussian mixtures. If software="rstan" the function performs Hamiltonian Monte Carlo (HMC) sampling via the rstan package (see the vignette and the Stan project for any help).

After MCMC sampling, this function clusters the units in k groups, calls the piv_sel() function and yields the pivots obtained from one among four different methods (the user may specify one among them via piv.criterion argument): "maxsumint", "minsumnoint", "maxsumdiff" and "MUS" (available only if k <= 4) (see the vignette for thorough details). Due to computational reasons clarified in the Details section of the function piv_rel, the length of the MCMC chains will be minor or equal than the input argument nMC; this length, corresponding to the value true.iter returned by the procedure, is the number of MCMC iterations for which the number of JAGS/Stan groups exactly coincides with the prespecified number of groups k.

Value

The function gives the MCMC output, the clustering solutions and the pivotal indexes. Here there is a complete list of outputs.

true.iter

The number of MCMC iterations for which the number of JAGS/Stan groups exactly coincides with the prespecified number of groups k.

Mu

An estimate of the groups' means.

groupPost

true.iter \times N matrix with values from 1:k indicating the post-processed group allocation vector.

mcmc_mean

If y is a vector, a true.iter \times k matrix with the post-processed MCMC chains for the mean parameters; if y is a matrix, a true.iter \times D \times k array with the post-processed MCMC chains for the mean parameters.

mcmc_sd

If y is a vector, a true.iter \times k matrix with the post-processed MCMC chains for the sd parameters; if y is a matrix, a true.iter \times D array with the post-processed MCMC chains for the sd parameters.

mcmc_weight

A true.iter \times k matrix with the post-processed MCMC chains for the weights parameters.

mcmc_mean_raw

If y is a vector, a (nMC-burn) \times k matrix with the raw MCMC chains for the mean parameters as given by JAGS; if y is a matrix, a (nMC-burn) \times D \times k array with the raw MCMC chains for the mean parameters as given by JAGS/Stan.

mcmc_sd_raw

If y is a vector, a (nMC-burn) \times k matrix with the raw MCMC chains for the sd parameters as given by JAGS/Stan; if y is a matrix, a (nMC-burn) \times D array with the raw MCMC chains for the sd parameters as given by JAGS/Stan.

mcmc_weight_raw

A (nMC-burn) \times k matrix with the raw MCMC chains for the weights parameters as given by JAGS/Stan.

C

The N \times N co-association matrix constructed from the MCMC sample.

grr

The vector of cluster membership returned by "diana" or "hclust".

pivots

The vector of indices of pivotal units identified by the selected pivotal criterion.

model

The JAGS/Stan model code. Apply the "cat" function for a nice visualization of the code.

stanfit

An object of S4 class stanfit for the fitted model (only if software="rstan").

Author(s)

Leonardo Egidi legidi@units.it

References

Egidi, L., Pappadà, R., Pauli, F. and Torelli, N. (2018). Relabelling in Bayesian Mixture Models by Pivotal Units. Statistics and Computing, 28(4), 957-969.

Examples


### Bivariate simulation

## Not run: 
N   <- 200
k   <- 4
D   <- 2
nMC <- 1000
M1  <- c(-.5,8)
M2  <- c(25.5,.1)
M3  <- c(49.5,8)
M4  <- c(63.0,.1)
Mu  <- rbind(M1,M2,M3,M4)
Sigma.p1 <- diag(D)
Sigma.p2 <- 20*diag(D)
W <- c(0.2,0.8)
sim <- piv_sim(N = N, k = k, Mu = Mu,
               Sigma.p1 = Sigma.p1,
               Sigma.p2 = Sigma.p2, W = W)

## rjags (default)
res <- piv_MCMC(y = sim$y, k =k, nMC = nMC)

## rstan
res_stan <- piv_MCMC(y = sim$y, k =k, nMC = nMC,
                     software ="rstan")

# changing priors
res2 <- piv_MCMC(y = sim$y,
                 priors = list (
                 mu_0=c(1,1),
                 S2 = matrix(c(0.002,0,0, 0.1),2,2, byrow=TRUE),
                 S3 = matrix(c(0.1,0,0,0.1), 2,2, byrow =TRUE)),
                 k = k, nMC = nMC)

## End(Not run)


### Fishery data (bayesmix package)

## Not run: 
library(bayesmix)
data(fish)
y <- fish[,1]
k <- 5
nMC <- 5000
res <- piv_MCMC(y = y, k = k, nMC = nMC)

# changing priors
res2   <- piv_MCMC(y = y,
                   priors = list(kind = "condconjugate",
                   parameter = "priorsRaftery",
                   hierarchical = "tau"),  k =k, nMC = nMC)

## End(Not run)

pivmet documentation built on March 7, 2023, 6:34 p.m.