nopenalty: Classical Model-based Clustering In PARSE: Model-Based Clustering with Regularization Methods for High-Dimensional Data

Description

This function fits a classical model-based clustering under the framework of finite mixture models.

Usage

```
nopenalty(K, y, N = 100, kms.iter = 100, kms.nstart = 100, eps.diff = 1e-5,
          eps.em = 1e-5, model.crit = 'gic', short.output = FALSE)
```

Arguments

| Argument | Description |
|---|---|
| `K` | A vector of candidate numbers of clusters. |
| `y` | A p-dimensional data matrix. Each row is an observation. |
| `N` | The maximum number of iterations in the EM algorithm. The default value is 100. |
| `kms.iter` | The maximum number of iterations in the K-means algorithm, whose outputs are the starting values for the EM algorithm. |
| `kms.nstart` | The number of starting values in K-means. |
| `eps.diff` | The threshold for the pairwise difference between two mean values. Any difference below it is treated as 0. |
| `eps.em` | The threshold for the stopping criterion of the EM algorithm. |
| `model.crit` | The criterion used to select the number of clusters K. It is either `'bic'` for the Bayesian Information Criterion or `'gic'` for the Generalized Information Criterion. |
| `short.output` | Whether to return a short version of the output. The short version is used for computing the adaptive parameters in the APFP or APL1 methods. The default value is FALSE. |

Details

This function estimates parameters μ, Σ, π and the clustering assignments in the model-based clustering using the mixture model,

y \sim ∑_{k=1}^K π_k f(y|μ_k, Σ)

where f(y|μ_k, Σ) is the density function of the normal distribution with mean μ_k and covariance Σ. Here we assume that all clusters share the same diagonal covariance matrix Σ.
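The mixture model above can be fitted with a standard EM loop. The following is a minimal base-R sketch of that idea, assuming a shared diagonal covariance and a K-means start as described on this page. It is not the PARSE implementation: the function name `em_shared_diag`, the stopping rule (change in log-likelihood below `eps.em`), and the returned field names are assumptions made for illustration.

```r
# Illustrative EM for a Gaussian mixture with one diagonal covariance shared
# by all clusters. Hypothetical helper, not part of the PARSE package.
em_shared_diag <- function(y, K, N = 100, eps.em = 1e-5,
                           kms.iter = 100, kms.nstart = 10) {
  n <- nrow(y); p <- ncol(y)

  # Starting values from K-means
  km <- kmeans(y, centers = K, iter.max = kms.iter, nstart = kms.nstart)
  mu <- km$centers                                            # K x p means
  prop <- as.numeric(table(factor(km$cluster, levels = seq_len(K)))) / n
  sigma2 <- colSums((y - mu[km$cluster, , drop = FALSE])^2) / n  # p variances

  llh.old <- -Inf
  for (iter in seq_len(N)) {
    # E-step: log(pi_k) + log f(y_i | mu_k, Sigma) for every i and k
    logd <- sapply(seq_len(K), function(k) {
      z <- sweep(y, 2, mu[k, ], "-")
      log(prop[k]) - 0.5 * (p * log(2 * pi) + sum(log(sigma2)) +
                              rowSums(sweep(z^2, 2, sigma2, "/")))
    })
    logd <- matrix(logd, nrow = n)            # keep a matrix when K = 1
    m <- apply(logd, 1, max)
    lse <- m + log(rowSums(exp(logd - m)))    # stable log-sum-exp per row
    llh <- sum(lse)                           # observed-data log-likelihood
    resp <- exp(logd - lse)                   # n x K responsibilities

    # M-step: proportions, means, and the pooled diagonal variances
    nk <- colSums(resp)
    prop <- nk / n
    mu <- (t(resp) %*% y) / nk
    ss <- numeric(p)
    for (k in seq_len(K)) {
      ss <- ss + colSums(resp[, k] * sweep(y, 2, mu[k, ], "-")^2)
    }
    sigma2 <- ss / n

    # Assumed stopping rule: small change in log-likelihood
    if (abs(llh - llh.old) < eps.em) break
    llh.old <- llh
  }
  list(mu = mu, sigma2 = sigma2, prop = prop,
       cluster = max.col(resp), llh = llh)
}

# Usage on two simulated bivariate clusters
set.seed(1)
y <- rbind(matrix(rnorm(100, 0, 1), ncol = 2),
           matrix(rnorm(100, 4, 1), ncol = 2))
fit <- em_shared_diag(y, K = 2)
fit$mu
```

Because Σ is diagonal and shared, the M-step pools the weighted squared deviations from all clusters into a single variance per dimension, which keeps the number of covariance parameters at p regardless of K.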

This function is also used to compute the adaptive parameters for functions `apfp` and `apL1`.

Value

This function returns the estimated parameters and some statistics of the optimal model among the given values of K, which is selected by BIC when `model.crit = 'bic'` or by GIC when `model.crit = 'gic'`.

| Value | Description |
|---|---|
| `mu.hat.best` | The estimated cluster means. |
| `sigma.hat.best` | The estimated covariance. |
| `p.hat.best` | The estimated cluster proportions. |
| `s.hat.best` | The clustering assignments. |
| `K.best` | The value of K that provides the optimal model. |
| `llh.best` | The log-likelihood of the optimal model. |
| `gic.best` | The GIC of the optimal model. |
| `bic.best` | The BIC of the optimal model. |
| `ct.mu.best` | The degrees of freedom in the cluster means of the optimal model. |
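For orientation, the generic textbook form of BIC is given below. This is only the standard definition: the exact parameter count d used by the package, its sign convention, and the GIC penalty are not stated on this page, so d is a placeholder here.

```latex
\mathrm{BIC} = -2\,\ell(\hat{\theta}) + d \,\log n
```

Here ℓ(θ̂) is the maximized log-likelihood (reported as `llh.best`), n is the number of observations (rows of `y`), and d is the number of free parameters in the fitted model. Under this convention, the candidate K with the smallest value is preferred.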

References

Fraley, C., & Raftery, A. E. (2002) Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association 97(458), 611–631.

See Also

`apfp`, `apL1`, `parse`

Examples

```
library(PARSE)

# Two bivariate normal clusters with 50 observations each
y <- rbind(matrix(rnorm(100, 0, 1), ncol = 2), matrix(rnorm(100, 4, 1), ncol = 2))
output <- nopenalty(K = c(1:2), y)
output$mu.hat.best
```