nopenalty: Classical Model-based Clustering


View source: R/nopenalty.R

Description

This function fits classical (unpenalized) model-based clustering under the framework of finite mixture models.

Usage

nopenalty(K, y, N = 100, kms.iter = 100, kms.nstart = 100,
           eps.diff = 1e-5, eps.em = 1e-5,
           model.crit = 'gic', short.output = FALSE)

Arguments

K

A vector of candidate numbers of clusters.

y

A p-dimensional data matrix. Each row is an observation

N

The maximum number of iterations in the EM algorithm. The default value is 100.

kms.iter

The maximum number of iterations in the K-means algorithm whose outputs are the starting values for the EM algorithm

kms.nstart

The number of starting values in K-means

eps.diff

The threshold for the pairwise difference between two mean values. Any difference below it is treated as 0.

eps.em

The tolerance for the stopping criterion of the EM algorithm.

model.crit

The criterion used to select the number of clusters K. It is either ‘bic’ for Bayesian Information Criterion or ‘gic’ for Generalized Information Criterion.

short.output

Whether to return a short version of the output. The short version is used for computing the adaptive parameters in the APFP or APL1 methods. The default value is FALSE.
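The role of kms.iter and kms.nstart can be illustrated with base R alone. The following is a minimal sketch, assuming the starting values come from stats::kmeans with these two arguments passed as iter.max and nstart; the names mu.init, p.init and sigma.init are hypothetical and are not part of the package.

```r
# Hypothetical sketch: derive EM starting values from K-means (base R only).
set.seed(1)
y <- rbind(matrix(rnorm(100, 0, 1), ncol = 2),
           matrix(rnorm(100, 4, 1), ncol = 2))

K <- 2
# iter.max plays the role of kms.iter, nstart the role of kms.nstart (assumption)
km <- kmeans(y, centers = K, iter.max = 100, nstart = 100)

mu.init <- km$centers             # K x p matrix of starting cluster means
p.init  <- km$size / nrow(y)      # starting cluster proportions
# Pooled within-cluster variance per dimension (one shared diagonal covariance)
sigma.init <- colSums((y - mu.init[km$cluster, ])^2) / nrow(y)
```

A larger nstart makes the K-means solution less sensitive to its own random initialization, at the cost of more computation.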

Details

This function estimates parameters μ, Σ, π and the clustering assignments in the model-based clustering using the mixture model,

y ∼ ∑_{k=1}^K π_k f(y | μ_k, Σ)

where f(y | μ_k, Σ) is the density function of the Normal distribution with mean μ_k and covariance Σ. Here we assume that all clusters share the same diagonal covariance matrix.
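To make the model concrete, here is a minimal base R sketch of an EM iteration for a Gaussian mixture with one shared diagonal covariance. It is an illustration under stated assumptions, not the package's implementation: the helper names comp_density and em_step, the starting values, and the stopping rule on the change in log-likelihood are all hypothetical.

```r
# Component density under a diagonal covariance: product of univariate normals.
comp_density <- function(y, mu.k, sigma2) {
  z2 <- sweep(sweep(y, 2, mu.k)^2, 2, sigma2, "/")
  exp(-0.5 * rowSums(z2) - 0.5 * sum(log(2 * pi * sigma2)))
}

# One EM iteration; mu is K x p, sigma2 has length p, prop has length K.
em_step <- function(y, mu, sigma2, prop) {
  n <- nrow(y); K <- nrow(mu)
  # E-step: weighted component densities, then posterior membership probabilities
  dens <- vapply(seq_len(K),
                 function(k) prop[k] * comp_density(y, mu[k, ], sigma2),
                 numeric(n))
  llh <- sum(log(rowSums(dens)))   # log-likelihood at the current parameters
  w <- dens / rowSums(dens)
  # M-step: update means, shared diagonal variances, and proportions
  mu.new <- crossprod(w, y) / colSums(w)
  ss <- lapply(seq_len(K),
               function(k) colSums(w[, k] * sweep(y, 2, mu.new[k, ])^2))
  list(mu = mu.new, sigma2 = Reduce(`+`, ss) / n, prop = colMeans(w), llh = llh)
}

set.seed(1)
y <- rbind(matrix(rnorm(100, 0, 1), ncol = 2),
           matrix(rnorm(100, 4, 1), ncol = 2))

# Crude starting values (assumption): one observation from each half of the data
fit <- list(mu = y[c(1, 100), ], sigma2 = apply(y, 2, var), prop = c(0.5, 0.5))
llh.path <- numeric(0)
for (iter in 1:100) {              # cap of 100 mirrors the default N
  fit <- em_step(y, fit$mu, fit$sigma2, fit$prop)
  llh.path <- c(llh.path, fit$llh)
  m <- length(llh.path)
  # Assumed stopping rule: change in log-likelihood below 1e-5
  if (m > 1 && abs(llh.path[m] - llh.path[m - 1]) < 1e-5) break
}
fit$mu
```

Because the covariance is diagonal and shared, the M-step only needs one variance per dimension, pooled over all clusters.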

This function is also used to compute the adaptive parameters for functions apfp and apL1.

Value

This function returns the estimated parameters and some statistics of the optimal model among the given values of K, which is selected by BIC when model.crit = 'bic' or GIC when model.crit = 'gic'.

mu.hat.best

The estimated cluster means.

sigma.hat.best

The estimated covariance.

p.hat.best

The estimated cluster proportions.

s.hat.best

The clustering assignments.

K.best

The value of K that provides the optimal model

llh.best

The log-likelihood of the optimal model

gic.best

The GIC of the optimal model

bic.best

The BIC of the optimal model

ct.mu.best

The degrees of freedom in the cluster means of the optimal model
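As a rough guide to how llh.best relates to bic.best, the textbook BIC can be sketched as below. This is an assumption-laden illustration: the function bic_mixture is hypothetical, the free-parameter count is the standard one for a mixture with a shared diagonal covariance, and the package's own formula and sign convention may differ.

```r
# Textbook BIC for a K-cluster Gaussian mixture with shared diagonal covariance.
# The parameter count is an assumption, not taken from the package source.
bic_mixture <- function(llh, n, K, p) {
  df <- K * p + p + (K - 1)   # cluster means + shared variances + free proportions
  -2 * llh + log(n) * df      # under this convention, smaller is better
}

# Illustrative inputs only: log-likelihood -350, 100 observations, K = 2, p = 2
bic_mixture(llh = -350, n = 100, K = 2, p = 2)
```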

References

Fraley, C., & Raftery, A. E. (2002). Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association, 97(458), 611–631.

See Also

apfp, apL1, parse

Examples

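library(PARSE)
# Simulate two clusters of 50 two-dimensional observations, centered at 0 and 4
y <- rbind(matrix(rnorm(100, 0, 1), ncol = 2), matrix(rnorm(100, 4, 1), ncol = 2))
# Fit K = 1 and K = 2; the better model is chosen by the default criterion ('gic')
output <- nopenalty(K = c(1:2), y)
output$mu.hat.best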

PARSE documentation built on May 30, 2017, 1:16 a.m.