selectMix: Mixture Model Selection
In sppmix: Modeling Spatial Poisson and Related Point Processes


This function suggests the best number of components by computing model selection criteria, including AIC (Akaike Information Criterion), BIC (Bayesian Information Criterion), and ICLC (Integrated Classification Likelihood Criterion).

Since the only parameter of interest is the number of components of the mixture, we consider several fixed numbers of components, given in the vector Ms, and fit mixture models whose remaining parameters are approximated by the MAP estimators from DAMCMC (data augmentation MCMC) runs.

For examples see http://faculty.missouri.edu/~micheasa/sppmix/sppmix_all_examples.html#selectMix

selectMix(pp, Ms, L = 30000, burnin = 0.1 * L, truncate = FALSE, runallperms = 0)

`pp`	Point pattern object of class `ppp`.
`Ms`	A vector of integers, representing different numbers of components to assess for the mixture model for the intensity function.
`L`	Number of iterations for the DAMCMC we run for each number of components in `Ms`; default is 30000.
`burnin`	Number of initial realizations to discard. By default, it is 1/10 of the total number of iterations.
`truncate`	Logical variable indicating whether or not we normalize the densities of the mixture components to have all their mass within the window defined in the point pattern `pp`.
`runallperms`	Set to 0 to use an approximation to the Likelihood and Entropy within the MCMC (not affected by label switching). Set to 1 to use an identifiability constraint to permute the labels and use the posterior means of the parameters to compute the criteria. Set to 2 to use the decision theoretic approach (minimize Squared Error Loss) in order to permute the labels. This last setting can take a long time to run for mixtures with more than 7 components.

For each integer in the vector Ms, we fit a mixture with that many components using DAMCMC. Then the criteria are computed and presented at the end of the calculations.

Note that the AIC and BIC do not account for constraints in the parameter space of the mixture model parameters. The ICLC uses the estimated entropy of the distribution of the membership indicators and should therefore be trusted more than AIC and BIC in identifying the true number of components.
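As a rough illustration of how the three criteria relate, the following base R sketch computes them from a log-likelihood value. The helper names `mix_criteria` and `membership_entropy` are hypothetical and not part of sppmix. The parameter count assumes a bivariate normal mixture, and the ICL-type form (BIC plus twice the entropy) follows McLachlan and Peel (2000); the exact formulas used internally by selectMix may differ.

```r
# Illustrative helpers (not part of sppmix).

# Entropy of an n x m matrix z of membership probabilities,
# using the convention 0 * log(0) = 0.
membership_entropy <- function(z) {
  -sum(ifelse(z > 0, z * log(z), 0))
}

# Criteria for an m-component bivariate normal mixture fitted to n points.
mix_criteria <- function(loglik, m, n, entropy = 0) {
  # Assumed free-parameter count: (m - 1) weights, 2m mean coordinates,
  # and 3m distinct covariance entries.
  k <- (m - 1) + 2 * m + 3 * m
  aic <- -2 * loglik + 2 * k
  bic <- -2 * loglik + k * log(n)
  # ICL-type criterion: BIC penalized further by the membership entropy.
  iclc <- bic + 2 * entropy
  c(AIC = aic, BIC = bic, ICLC = iclc)
}
```

With zero entropy (perfectly separated components) the ICL-type value coincides with BIC; overlapping components increase the entropy and therefore the penalty.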

In addition, we run Stephens' BDMCMC and present the posterior distribution for the number of components.

Together, these methods help us make an informed choice about the true number of components. We can then fit the DAMCMC with the chosen number of components and take care of label switching (if present) in order to achieve mixture deconvolution. If we simply want the intensity surface, then the Bayesian model average from the BDMCMC fit is the best solution.
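The follow-up workflow described above might be sketched as follows. The wrapper `fit_after_selection` and its arguments `m_chosen` and `m_max` are hypothetical; the sppmix functions it calls are those listed under See Also, and the argument names used here are assumptions that should be checked against each function's help page.

```r
# Hypothetical wrapper (not part of sppmix): follow-up fits once a number
# of components has been chosen.
fit_after_selection <- function(pp, m_chosen, m_max, L = 30000) {
  if (!requireNamespace("sppmix", quietly = TRUE)) {
    stop("the sppmix package is required")
  }
  # DAMCMC fit with the chosen number of components
  da_fit <- sppmix::est_mix_damcmc(pp, m = m_chosen, truncate = FALSE, L = L)
  # Permute the labels to undo label switching, if present
  da_fixed <- sppmix::FixLS_da(da_fit)
  # BDMCMC fit with at most m_max components, then its Bayesian model average
  bd_fit <- sppmix::est_mix_bdmcmc(pp, m = m_max, truncate = FALSE, L = L)
  bma <- sppmix::GetBMA(bd_fit)
  list(deconvolution = da_fixed, surface = bma)
}
```

The first element serves the mixture deconvolution goal, while the second serves the case where only the surface is of interest.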

A list containing the following components:

`AIC`	the values of the AIC criterion
`BIC`	the values of the BIC criterion
`ICLC`	the values of the ICLC criterion
`Marginal`	the values of the marginal density
`LogLikelihood`	the values of the LogLikelihood
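One way to read the returned list is to look up, for each criterion, the entry of Ms with the smallest value. The helper below is a hypothetical sketch, not part of sppmix. It assumes that each component is a numeric vector with one value per entry of Ms, in the same order, and that smaller values indicate a better model.

```r
# Hypothetical helper (not part of sppmix): number of components preferred
# by each criterion, assuming smaller values are better and that the
# vectors in `sel` are aligned with `Ms`.
best_by_criterion <- function(sel, Ms, criteria = c("AIC", "BIC", "ICLC")) {
  sapply(criteria, function(nm) Ms[which.min(sel[[nm]])])
}

# Usage, given a result such as sel <- selectMix(pp, 1:5):
# best_by_criterion(sel, Ms = 1:5)
```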

Jiaxun Chen, Sakis Micheas

Stephens, M. (2000). Bayesian analysis of mixture models with an unknown number of components: an alternative to reversible jump methods. The Annals of Statistics, 28, 1, 40-74.

McLachlan, G., and Peel, D. (2000). Finite Mixture Models. Wiley-Interscience.

Jasra, A., Holmes, C.C. and Stephens, D. A. (2005). Markov Chain Monte Carlo Methods and the Label Switching Problem in Bayesian Mixture Modeling. Statistical Science, 20, 50-67.

normmix, square, est_mix_damcmc, est_mix_bdmcmc, GetBMA, FixLS_da, rsppmix

# create the true mixture intensity surface
truesurf <- normmix(ps=c(.2, .6,.2), mus=list(c(0.3, 0.3), c(0.7, 0.7), c(0.5, 0.5)),
 sigmas = list(.01*diag(2), .01*diag(2), .01*diag(2)), lambda=100, win=spatstat::square(1))
plot(truesurf)
# generate the point pattern; truncate=TRUE is the default, so we disable it explicitly
pp <- rsppmix(truesurf,truncate=FALSE)
plot(pp,mus=truesurf$mus)
# compute model selection criteria via an approximation that is not affected by label
# switching and will typically work well for large L
ModelSel=selectMix(pp,1:5,truncate=FALSE)
# show info
ModelSel
# generate the intensity surface randomly
truesurf <- rmixsurf(5,100,xlim = c(-3,3), ylim = c(-3,3), rand_m = TRUE)
truesurf
pp <- rsppmix(truesurf,truncate=FALSE)
ModelSel0=selectMix(pp,1:5,runallperms = 0, truncate=FALSE)
ModelSel1=selectMix(pp,1:5,runallperms = 1, truncate=FALSE)
ModelSel2=selectMix(pp,1:5,runallperms = 2, truncate=FALSE)