label.switching: Main calling function
In label.switching: Relabelling MCMC Outputs of Mixture Models

Description Usage Arguments Value Note Author(s) References See Also Examples

This is the main function of the package. It is used to reorder a simulated MCMC sample of the parameters of a mixture (or more general a hidden Markov model) according to eight label switching solving methods: ECR algorithm (default version), ECR algorithm (two iterative versions), PRA algorithm, Stephens' algorithm, Artificial Identifiability Constraint (AIC), Data-Based relabelling and a probabilistic relabelling algorithm (SJW). The input depends on the type of the label switching method. The output contains a list with the permutation returned by each method, the corresponding single best clusterings and the CPU time demanded for each method. In what follows: m denotes the number of MCMC iterations, n denotes the sample size of the observed data, K denotes the number of mixture components and J the number of different types of parameters of the model.

label.switching(method, zpivot, z, K, prapivot, p, complete, 
			mcmc, sjwinit, data, constraint, 
			groundTruth, thrECR, thrSTE, thrSJW, 
			maxECR, maxSTE, maxSJW, userPerm)

`method`	any non-empty subset of c("ECR","ECR-ITERATIVE-1","PRA","ECR-ITERATIVE-2","STEPHENS","SJW","AIC","DATA-BASED") indicating the desired label-switching solving method. Also available is the option "USER-PERM" which corresponds to a user-defined set of permutations `userPerm`.
`zpivot`	d\times n-dimensional array of pivot allocation vectors, where d denotes the number of pivots. This is demanded by the `ecr` method. The method will be applied d times.
`z`	m\times n integer array of the latent allocation vectors generated from an MCMC algorithm.
`K`	the number of mixture components. This is demanded by the `ecr`, `ecr.iterative.1` and `ecr.iterative.2` methods.
`prapivot`	K\times J array containing the parameter that will be used as a pivot by the `pra` method.
`p`	m\times n \times K dimensional array of allocation probabilities of the n observations among the K mixture components, for each iteration t = 1,…,m of the MCMC algorithm. This is demanded by the `ecr.iterative.2` and `stephens` methods.
`complete`	function that returns the complete log-likelihood of the mixture model. Demanded by `sjw` method.
`mcmc`	m\times K\times J array of simulated MCMC parameters. Needed by `sjw` and `pra` methods.
`sjwinit`	An index pointing at the MCMC iteration whose parameters will initialize the `sjw` algorithm (optional).
`data`	n-dimensional data vector/array. Needed by the `sjw` and `dataBased` algorithms.
`constraint`	An (optional) integer between 1 and J corresponding to the parameter that will be used to apply the Identifiabiality Constraint. In this casethe mcmc output is reordered according to the constraint mcmc[i,1,constraint] < … < mcmc[i,K,constraint]. If `constraint = "ALL"`, all J Identifiability Constraints are applied. Default value: 1.
`groundTruth`	Optional integer vector of n allocations, which are considered as the 'ground truth' allocations of the n observations among the K mixture components. The output of all methods will be relabelled in a way that the resulting single best clusterings maximize their similarity with the ground truth. This option is very useful in simulation studies or in any other case that the cluster labels are known in order to perform comparisons between methods.
`thrECR`	An (optional) positive number controlling the convergence criterion for `ecr.iterative.1` and `ecr.iterative.2`. Default value: 1e-6.
`thrSTE`	An (optional) positive number controlling the convergence criterion for `stephens`. Default value: 1e-6.
`thrSJW`	An (optional) positive number controlling the convergence criterion for `sjw`. Default value: 1e-6.
`maxECR`	An (optional) integer controlling the max number of iterations for `ecr.iterative.1` and `ecr.iterative.2`. Default value: 100.
`maxSTE`	An (optional) integer controlling the max number of iterations for `stephens`. Default value: 100.
`maxSJW`	An (optional) integer controlling the max number of iterations for `sjw`. Default value: 100.
`userPerm`	An (optional) list with user-defined permutations. It is required only if "USER-PERM" has been chosen in `method`. In this case, `userPerm[[i]]` is an m\times K array of permutations for all i = 1,…,S, where S denotes the number of permutation arrays. This is useful in case that the user wants to compare his/hers own relabelling method with the available ones.

`permutations`	an m\times K array of permutations per method.
`clusters`	an n dimensional vector of best clustering of the the observations for each method.
`timings`	CPU time needed for each relabelling method.
`similarity`	correlation matrix between the label switching solving methods in terms of their matching best-clustering allocations.

If the ground truth is not given, all methods are reordered using the estimated single best clustering of the first provided method. The methods sjw and pra are not suggested for large number of components. Also note that sjw might be quite slow even for small number of components. In this case try adjusting thrSJW or maxSJW to smaller values the default ones.

Panagiotis Papastamoulis

Papastamoulis P. (2016). label.switching: An R Package for Dealing with the Label Switching Problem in MCMC Outputs. Journal of Statistical Software, Code Snippets, 69(1): 1-24.

ecr, ecr.iterative.1, ecr.iterative.2, stephens, pra, sjw, dataBased, aic

# We will apply the following methods:
# ECR, ECR-ITERATIVE-1, PRA, AIC and DATA-BASED.
# default ECR will use two different pivots.

#load a toy example: MCMC output consists of the random beta model
# applied to a normal mixture of \code{K=2} components. The number of
# observations is equal to \code{n=5}. The number of MCMC samples is
# equal to \code{m=300}. simulated allocations are stored to array \code{z}. 
data("mcmc_output")
mcmc.pars<-data_list$"mcmc.pars"
# mcmc parameters are stored to array \code{mcmc.pars}
# mcmc.pars[,,1]: simulated means of the two components
# mcmc.pars[,,2]: simulated variances 
# mcmc.pars[,,3]: simulated weights 
# We will use two pivots for default ECR algorithm:
# the first one corresponds to iteration \code{mapindex} (complete MAP)
# the second one corresponds to iteration \code{mapindex.non} (observed MAP)

z<-data_list$"z"
K<-data_list$"K"
x<-data_list$"x"
mapindex<-data_list$"mapindex"
mapindex.non<-data_list$"mapindex.non"
# The PRA method will use as pivot the iteration that corresponds to
# the observed MAP estimate (mapindex). 

#Apply (a subset of the available) methods by typing:

ls<-label.switching(method=c("ECR","ECR-ITERATIVE-1","PRA", "AIC","DATA-BASED"),
zpivot=z[c(mapindex,mapindex.non),],z = z,K = K, data = x,
prapivot = mcmc.pars[mapindex,,],mcmc = mcmc.pars)

#plot the raw and reordered means of the K=2 normal mixture components for each method
par(mfrow = c(2,4))
#raw MCMC output for the means (with label switching)
matplot(mcmc.pars[,,1],type="l",
xlab="iteration",main="Raw MCMC output",ylab = "means")
# Reordered outputs
matplot(permute.mcmc(mcmc.pars,ls$permutations$"ECR-1")$output[,,1],type="l",
xlab="iteration",main="ECR (1st pivot)",ylab = "means")
matplot(permute.mcmc(mcmc.pars,ls$permutations$"ECR-2")$output[,,1],type="l",
xlab="iteration",main="ECR (2nd pivot)",ylab = "means")
matplot(permute.mcmc(mcmc.pars,ls$permutations$"ECR-ITERATIVE-1")$output[,,1],
type="l",xlab="iteration",main="ECR-iterative-1",ylab = "means")
matplot(permute.mcmc(mcmc.pars,ls$permutations$"PRA")$output[,,1],type="l",
xlab="iteration",main="PRA",ylab = "means")
matplot(permute.mcmc(mcmc.pars,ls$permutations$"AIC")$output[,,1],type="l",
xlab="iteration",main="AIC",ylab = "means")
matplot(permute.mcmc(mcmc.pars,ls$permutations$"DATA-BASED")$output[,,1],type="l",
xlab="iteration",main="DATA-BASED",ylab = "means")

#######################################################
# if the useR wants to apply the STEPHENS and SJW algorithm as well:
# The STEPHENS method requires the classification probabilities
p<-data_list$"p"

# The SJW method needs to define the complete log-likelihood of the
# model. For the univariate normal mixture, this is done as follows:

complete.normal.loglikelihood<-function(x,z,pars){
	#x: denotes the n data points
	#z: denotes an allocation vector (size=n)
	#pars: K\times 3 vector of means,variance, weights
	# pars[k,1]: corresponds to the mean of component k
	# pars[k,2]: corresponds to the variance of component k
	# pars[k,3]: corresponds to the weight of component k
	g <- dim(pars)[1]
	n <- length(x)
	logl<- rep(0, n)
 	logpi <- log(pars[,3])
	mean <- pars[,1]
	sigma <- sqrt(pars[,2])
	logl<-logpi[z] + dnorm(x,mean = mean[z],sd = sigma[z],log = T)
	return(sum(logl))
}

# and then run (after removing all #):
#ls<-label.switching(method=c("ECR","ECR-ITERATIVE-1","ECR-ITERATIVE-2",
#"PRA","STEPHENS","SJW","AIC","DATA-BASED"),
#zpivot=z[c(mapindex,mapindex.non),],z = z,
#K = K,prapivot = mcmc.pars[mapindex,,],p=p,
#complete = complete.normal.loglikelihood,mcmc.pars,
#data = x)