# EmSkewfit: Fit the Multivariate Skew Mixture Models In EMMIXskew: The EM Algorithm and Skew Mixture Distribution

## Description

The engines that fit mixture models to data, starting either from an initial partition (`EmSkewfit1`) or from initial parameter values (`EmSkewfit2`).

## Usage

```r
EmSkewfit1(dat, g, clust, distr, ncov, itmax, epsilon, initloop = 20)
EmSkewfit2(dat, g, init, distr, ncov, itmax, epsilon)
```

## Arguments

| Argument | Description |
|---|---|
| `dat` | The dataset, an n by p numeric matrix, where n is the number of observations and p is the dimension of the data. |
| `g` | The number of components of the mixture model. |
| `distr` | A three-letter string indicating the type of distribution to be fitted. See Details. |
| `ncov` | A small integer indicating the type of covariance structure. See Details. |
| `clust` | A vector of integers specifying the initial partition of the data. |
| `init` | A list containing the initial parameters for the mixture model. See Details. |
| `itmax` | A large integer specifying the maximum number of iterations to apply. |
| `epsilon` | A small number used to stop the EM algorithm loop once the relative difference between the log-likelihoods of successive iterations becomes sufficiently small. |
| `initloop` | An integer specifying the number of initial loops. |
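The role of `epsilon` can be pictured with a few lines of base R. This is a minimal sketch, not the package's internal code: the helper name `has_converged` is hypothetical, and the exact form of the relative difference used inside `EmSkewfit1` and `EmSkewfit2` is an assumption.

```r
# Hypothetical helper (not part of EMMIXskew): TRUE once the relative change
# between the last two log-likelihood values drops below epsilon.
has_converged <- function(lk, epsilon = 1e-4) {
  k <- length(lk)
  if (k < 2) return(FALSE)  # need two iterations to compare
  abs(lk[k] - lk[k - 1]) < epsilon * abs(lk[k - 1])
}

# Made-up log-likelihood trace, for illustration only
lk <- c(-4120.5, -3987.2, -3951.8, -3951.7)
has_converged(lk[1:3])  # FALSE: the relative change is still about 9e-3
has_converged(lk)       # TRUE: the relative change is about 2.5e-5
```

In this picture, `itmax` is the cap on the number of iterations if the condition is never met.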

## Details

The distribution type is set by the `distr` parameter, which may take any one of the following values: "mvn" for a multivariate normal distribution, "mvt" for a multivariate t-distribution, "msn" for a multivariate skew normal distribution, and "mst" for a multivariate skew t-distribution.

The covariance matrix type is set by the `ncov` parameter, which may take any one of the following values: `ncov = 1` for a common variance, `ncov = 2` for a common diagonal variance, `ncov = 3` for a general variance, `ncov = 4` for a diagonal variance, and `ncov = 5` for sigma(h)*I(p) (a diagonal covariance whose diagonal elements are all identical).
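To make the five structures concrete, the sketch below imposes each one on a p by p by g array of component covariances in base R. It is an illustration under stated assumptions, not the package's estimator: the helper `constrain_sigma` is hypothetical, pooling by the mixing proportions for the "common" cases is an assumption, and so is using the average variance as sigma(h) for `ncov = 5`.

```r
# Hypothetical helper (not part of EMMIXskew): impose one of the five `ncov`
# structures on a p x p x g array S, given mixing proportions `pro`.
constrain_sigma <- function(S, pro, ncov) {
  p <- dim(S)[1]
  g <- dim(S)[3]
  # Assumption: "common" matrices are pooled with the mixing proportions
  pooled <- matrix(0, p, p)
  for (h in seq_len(g)) pooled <- pooled + pro[h] * S[, , h]
  out <- S
  for (h in seq_len(g)) {
    Sh <- matrix(S[, , h], p, p)
    out[, , h] <- switch(as.character(ncov),
      "1" = pooled,                      # common variance
      "2" = diag(diag(pooled), p),       # common diagonal variance
      "3" = Sh,                          # general variance
      "4" = diag(diag(Sh), p),           # diagonal variance
      "5" = diag(mean(diag(Sh)), p),     # sigma(h) * I(p), assumed average variance
      stop("ncov must be an integer from 1 to 5"))
  }
  out
}

S <- array(0, c(2, 2, 2))
S[, , 1] <- matrix(c(2, 0.5, 0.5, 1), 2, 2)
S[, , 2] <- matrix(c(4, -1, -1, 3), 2, 2)
pro <- c(0.5, 0.5)

constrain_sigma(S, pro, ncov = 4)[, , 1]  # off-diagonal entries are zeroed
constrain_sigma(S, pro, ncov = 5)[, , 1]  # 1.5 on the diagonal, 0 elsewhere
```

Moving from `ncov = 3` towards `ncov = 5` or `ncov = 2` reduces the number of free covariance parameters, which matters most when p is large relative to the number of observations per component.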

The parameter `init` is a list with the following elements: `pro`, a numeric vector of the mixing proportions of the components; `mu`, a p by g matrix whose jth column is the mean of the jth component; `sigma`, a three-dimensional p by p by g array whose jth slice `sigma[, , j]` is the covariance matrix of the jth component; `dof`, a vector of degrees of freedom, one for each component; and `delta`, a p by g matrix whose columns are the skew parameter vectors.
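Because `init` is a plain list, a shape mismatch is easy to introduce. The sketch below builds the list and checks each element against the dimensions described above. The helper `make_init` is hypothetical (it is not part of the package), and the check that the mixing proportions sum to 1 is an added assumption about what counts as valid input.

```r
# Hypothetical helper (not part of EMMIXskew): build an `init` list and check
# each element against the documented shapes (p = dimension, g = components).
make_init <- function(pro, mu, sigma, dof = NULL, delta = NULL) {
  p <- nrow(mu)
  g <- ncol(mu)
  stopifnot(
    length(pro) == g,
    isTRUE(all.equal(sum(pro), 1)),   # assumption: proportions sum to 1
    identical(dim(sigma), c(p, p, g)),
    is.null(dof) || length(dof) == g,
    is.null(delta) || identical(dim(delta), c(p, g))
  )
  init <- list(pro = pro, mu = mu, sigma = sigma)
  init$dof <- dof      # element stays absent when dof is NULL
  init$delta <- delta  # element stays absent when delta is NULL
  init
}

mu    <- cbind(c(4, -4), c(3.5, 4), c(0, 0))
sigma <- array(diag(2), c(2, 2, 3))  # three 2 x 2 identity matrices
init  <- make_init(c(0.3, 0.3, 0.4), mu, sigma,
                   dof = c(3, 5, 5),
                   delta = cbind(c(3, 3), c(1, 5), c(-3, 1)))
names(init)  # "pro" "mu" "sigma" "dof" "delta"
```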

## Value

| Component | Description |
|---|---|
| `error` | Error code: 0 = normal exit; 1 = did not converge within `itmax` iterations; 2 = failed to get the initial values; 3 = singularity. |
| `aic` | Akaike Information Criterion (AIC). |
| `bic` | Bayesian Information Criterion (BIC). |
| `pro` | A vector of mixing proportions, see Details. |
| `mu` | A numeric matrix with each column corresponding to a component mean, see Details. |
| `sigma` | An array of dimension (p,p,g) whose first two dimensions index the covariance matrix of each component, see Details. |
| `dof` | A vector of degrees of freedom for each component, see Details. |
| `delta` | A p by g matrix with each column corresponding to a skew parameter vector, see Details. |
| `clust` | A vector giving the final partition. |
| `loglik` | The log-likelihood at convergence. |
| `lk` | A vector of the log-likelihood at each EM iteration. |
| `tau` | An n by g matrix of posterior probabilities for each data point. |
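Two of the returned quantities can be related to the others with base R alone. The sketch below is an illustration under assumptions, not the package's code: the helper names are hypothetical, assigning each observation to the component with the largest posterior probability is the usual convention but is assumed here to be how `clust` relates to `tau`, and the number of free parameters `k` depends on `distr` and `ncov` and must be supplied by the caller.

```r
# Hypothetical helper: hard partition from an n x g posterior matrix,
# taking the component with the largest posterior probability in each row.
hard_partition <- function(tau) max.col(tau, ties.method = "first")

# Standard definitions of AIC and BIC; `k` is the number of free parameters
# and `n` the number of observations (both supplied by the caller).
info_criteria <- function(loglik, k, n) {
  c(aic = -2 * loglik + 2 * k,
    bic = -2 * loglik + k * log(n))
}

# Made-up posterior probabilities, for illustration only
tau <- rbind(c(0.90, 0.05, 0.05),
             c(0.10, 0.20, 0.70),
             c(0.30, 0.60, 0.10))
hard_partition(tau)  # 1 3 2
```

Checking `error` before reading any other component is a sensible habit, since a non-zero code means the remaining values may not describe a converged fit.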

## References

McLachlan G.J. and Krishnan T. (2008). The EM Algorithm and Extensions (2nd ed.). New Jersey: Wiley.

McLachlan G.J. and Peel D. (2000). Finite Mixture Models. New York: Wiley.

## See Also

`init.mix`, `initEmmix`, `EmSkew`, `rdemmix`, `rdemmix2`, `rdmvn`, `rdmvt`, `rdmsn`, `rdmst`.

## Examples

```r
library(EMMIXskew)

n1 <- 300; n2 <- 300; n3 <- 400
nn <- c(n1, n2, n3)

n  <- 1000
p  <- 2
ng <- 3

sigma <- array(0, c(2, 2, 3))
for (h in 2:3) sigma[, , h] <- diag(2)
sigma[, , 1] <- cbind(c(1, 0), c(0, 1))
mu <- cbind(c(4, -4), c(3.5, 4), c(0, 0))

# for other distributions,
# delta <- cbind(c(3, 3), c(1, 5), c(-3, 1))
# dof   <- c(3, 5, 5)

pro <- c(0.3, 0.3, 0.4)

distr <- "mvn"
ncov  <- 3

# first we generate a data set
set.seed(111) # random seed is set
dat <- rdemmix(nn, p, ng, distr, mu, sigma, dof = NULL, delta = NULL)

# start from initial partition
clust <- rep(1:ng, nn)
obj1 <- EmSkewfit1(dat, ng, clust, distr, ncov, itmax = 1000, epsilon = 1e-4)

# start from initial values
# alternatively, if we define initial values like
init <- list()
init$pro   <- pro
init$mu    <- mu
init$sigma <- sigma

# for other distributions,
# delta <- cbind(c(3, 3), c(1, 5), c(-3, 1))
# dof   <- c(3, 5, 5)
# init$dof   <- dof
# init$delta <- delta

obj2 <- EmSkewfit2(dat, ng, init, distr, ncov, itmax = 1000, epsilon = 1e-4)
```