Determines the number of components in the distribution.

Description

The function selects the number of mixture components using a trimmed variant of the Bayesian Information Criterion (BIC). The trimmed BIC combines a trimmed likelihood with a complexity penalty term. The function evaluates each candidate number of components, from 2 up to end, and returns the one that gives the maximum trimmed BIC.
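The idea of combining a trimmed likelihood with a complexity penalty can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper name trimmed_bic_sketch, the "2 * log-likelihood minus penalty" sign convention (chosen so that larger is better), and the choice to penalize with the log of the retained sample size are all assumptions made for this example.

```r
# Hypothetical helper (not part of the package): a trimmed BIC computed from
# per-observation log-densities under an already fitted model.
#   loglik_i : numeric vector of log f(x_i)
#   n_par    : number of free parameters in the model
#   alpha    : trimming proportion in [0, 0.5]
trimmed_bic_sketch <- function(loglik_i, n_par, alpha = 0) {
  n <- length(loglik_i)
  n_keep <- n - floor(alpha * n)  # observations retained after trimming
  # Keep the n_keep observations with the largest log-density,
  # i.e. drop the least likely ones.
  kept <- sort(loglik_i, decreasing = TRUE)[seq_len(n_keep)]
  # Assumed convention: larger values indicate a better model.
  2 * sum(kept) - n_par * log(n_keep)
}

set.seed(1)
x  <- c(rnorm(95), rnorm(5, mean = 10))       # 5 gross outliers
ll <- dnorm(x, mean = 0, sd = 1, log = TRUE)  # log-density under N(0, 1)

bic_plain   <- trimmed_bic_sketch(ll, n_par = 2, alpha = 0)    # no trimming
bic_trimmed <- trimmed_bic_sketch(ll, n_par = 2, alpha = 0.1)  # drop worst 10%
```

With alpha = 0 nothing is dropped, so the sketch reduces to an ordinary BIC under the same convention. With alpha > 0 the least likely observations no longer contribute to the likelihood, which is what limits the influence of outliers.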

Usage

trimmed_bic(data, alpha, end, method = c("reg", "rcm", "kotz"), iter_max = 100)

Arguments

data

This is a matrix or data frame of observations, where rows correspond to n observations and columns correspond to d variables. Categorical variables are not allowed.

alpha

This is the trimming proportion used in the calculation of the trimmed BIC. The alpha value ranges from 0 to 0.5. alpha = 0 corresponds to the conventional BIC.

end

This is an integer value that represents the maximum number of components in the mixture models considered. The minimum number of components is always set to 2.

method

This specifies which algorithm is used. Three algorithms are currently available: reg (regular-EM), rcm (spatial-EM), and kotz (kotz-EM).

iter_max

This is the maxiter parameter: the maximum number of iterations of the EM algorithm. The default value is 100. If the EM algorithm has not converged by this iteration, the parameters from the final iteration are returned.

Value

bic

A list containing the trimmed BIC values computed for each number of components in the range.

k

The optimal number of components selected.
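The relationship between the two returned values can be sketched as follows. The numbers below are made up purely for illustration and are not output from trimmed_bic; the variable names are also assumptions. The sketch only shows how the optimal k corresponds to the position of the maximum criterion value when the candidates start at 2.

```r
# Illustrative only: made-up criterion values for k = 2, 3, 4 (i.e. end = 4).
end <- 4
k_range <- 2:end
bic_values <- c(-1520.3, -1498.7, -1505.1)  # placeholder values, one per k

# The selected number of components is the candidate with the largest value.
k_best <- k_range[which.max(bic_values)]
```

Because the candidate range starts at 2 rather than 1, the index returned by which.max has to be mapped back through k_range instead of being used directly as the number of components.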

References

Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461–464.

Examples

## Not run: 
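# Two bivariate Gaussian clusters: 200 points around (0, 0), 200 around (2, 2)
x1 <- matrix(rnorm(2 * 200), ncol = 2)
x2 <- matrix(rnorm(2 * 200, 2, 1), ncol = 2)
data <- rbind(x1, x2)
alpha <- 0.5     # trimming proportion
end <- 3         # consider models with 2 to 3 components
iter_max <- 50   # cap on EM iterations
trimmed_bic(data, alpha, end, method = "rcm", iter_max = iter_max)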

## End(Not run)