# bootstrap: Bootstrap In EMMIXskew: The EM Algorithm and Skew Mixture Distribution

## Description

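`bootstrap` carries out a bootstrap standard error analysis of a fitted mixture model. `bootstrap.noc` carries out a bootstrap analysis of -2log(Lambda) for assessing the number of components.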

## Usage

```r
bootstrap(x, n, p, g, distr, ncov, popPAR, B = 99, replace = TRUE,
          itmax = 1000, epsilon = 1e-5)

bootstrap.noc(x, n, p, g1, g2, distr, ncov, B = 99, replace = TRUE,
              itmax = 1000, epsilon = 1e-5)
```

## Arguments

| Argument | Description |
|---|---|
| `n` | The number of observations. |
| `p` | The dimension of the data. |
| `B` | The number of bootstrap replications, that is, the number of simulated or resampled data sets to be tried. |
| `x` | The dataset, an n by p numeric matrix, where n is the number of observations and p the dimension of the data. |
| `g` | The number of components of the mixture model. |
| `g1,g2` | The range of the number of components of the mixture model. |
| `distr` | A three-letter string indicating the type of distribution to be fitted. See Details. |
| `ncov` | A small integer indicating the type of covariance structure. See Details. |
| `popPAR` | A list with components `pro`, a numeric vector of the mixing proportion of each component; `mu`, a p by g matrix with each column as the mean of the corresponding component; `sigma`, a three-dimensional p by p by g array whose jth slice (p,p,j) is the covariance matrix of the jth component; `dof`, a vector of degrees of freedom for each component; `delta`, a p by g matrix whose columns are the skew parameter vectors. |
| `replace` | A logical value indicating whether to resample with replacement. See Details. |
| `itmax` | A large integer specifying the maximum number of iterations to apply. |
| `epsilon` | A small number used to stop the EM algorithm loop when the relative difference between the log-likelihoods of successive iterations becomes sufficiently small. |
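As a rough sketch of the shape `popPAR` is described as having, the list below is assembled by hand in base R. The numeric values are borrowed from the example on this page (including its commented-out `delta` and `dof`), the mixing proportions are simply the generating proportions 300/1000, 300/1000 and 400/1000, and the dimension checks at the end are an illustration rather than anything the package is known to perform. In the page's own example the fitted object returned by `EmSkewfit1` is passed as `popPAR` instead.

```r
# Illustrative values only; the component names and shapes follow the
# argument description, everything else is an assumption.
p <- 2  # dimension of the data
g <- 3  # number of mixture components

# p by p by g array with an identity covariance matrix for each component
sigma <- array(0, c(p, p, g))
for (h in seq_len(g)) sigma[, , h] <- diag(p)

popPAR <- list(
  pro   = c(0.3, 0.3, 0.4),                     # mixing proportions, length g
  mu    = cbind(c(4, -4), c(3.5, 4), c(0, 0)),  # p by g matrix of means
  sigma = sigma,                                # p by p by g covariance array
  dof   = c(3, 5, 5),                           # degrees of freedom per component
  delta = cbind(c(3, 3), c(1, 5), c(-3, 1))     # p by g skew parameters
)

# Sanity checks on the shapes (not part of the package API)
stopifnot(
  length(popPAR$pro) == g,
  isTRUE(all.equal(sum(popPAR$pro), 1)),
  all(dim(popPAR$mu) == c(p, g)),
  all(dim(popPAR$sigma) == c(p, p, g)),
  length(popPAR$dof) == g,
  all(dim(popPAR$delta) == c(p, g))
)
```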

## Details

The distribution type, `distr`, is one of the following values: "mvn" for a multivariate normal, "mvt" for a multivariate t-distribution, "msn" for a multivariate skew normal distribution and "mst" for a multivariate skew t-distribution.

The covariance matrix type, represented by the `ncov` parameter, may be any one of the following: `ncov`=1 for a common variance, `ncov`=2 for a common diagonal variance, `ncov`=3 for a general variance, `ncov`=4 for a diagonal variance, `ncov`=5 for sigma(h)*I(p) (a diagonal covariance with identical diagonal elements).
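To make the five structures concrete, the base R sketch below takes two arbitrary covariance matrices and forces them into each shape. The helper `constrain_sigma` is hypothetical and is not part of EMMIXskew; the unweighted averaging it uses for the "common" cases is only a stand-in and should not be read as the way the package estimates these matrices.

```r
# Illustration only: show what each ncov structure looks like for g = 2
# components in p = 2 dimensions. Not the package's estimation procedure.
p <- 2
g <- 2
general <- array(0, c(p, p, g))
general[, , 1] <- matrix(c(2.0, 0.5, 0.5, 1.0), p, p)
general[, , 2] <- matrix(c(1.0, -0.3, -0.3, 3.0), p, p)

# Hypothetical helper: constrain a p by p by g array to a given structure
constrain_sigma <- function(sigma, ncov) {
  p <- dim(sigma)[1]
  g <- dim(sigma)[3]
  pooled <- apply(sigma, c(1, 2), mean)  # simple unweighted average
  out <- sigma
  for (h in seq_len(g)) {
    out[, , h] <- switch(
      as.character(ncov),
      "1" = pooled,                                      # common
      "2" = diag(diag(pooled), nrow = p),                # common diagonal
      "3" = sigma[, , h],                                # general
      "4" = diag(diag(sigma[, , h]), nrow = p),          # diagonal
      "5" = mean(diag(sigma[, , h])) * diag(p),          # sigma(h) * I(p)
      stop("ncov must be an integer from 1 to 5")
    )
  }
  out
}

# Print the first component's matrix under each structure
for (k in 1:5) {
  cat("ncov =", k, "\n")
  print(constrain_sigma(general, k)[, , 1])
}
```

With `ncov` 1 or 2 every component ends up with the same matrix, while 3, 4 and 5 let each component keep its own.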

When `replace` is FALSE, the parametric bootstrap is used; otherwise the bootstrap samples are drawn from the data with replacement.
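The difference between the two schemes can be sketched in base R. This is a simplified illustration, not the package's code: a single multivariate normal stands in for the fitted mixture, and all variable names are made up for the example.

```r
set.seed(1)
n <- 50
p <- 2
x <- matrix(rnorm(n * p), n, p)  # toy data set, n by p

# Resampling with replacement (replace = TRUE):
# draw n row indices with replacement and keep those rows
idx <- sample.int(n, n, replace = TRUE)
x_resampled <- x[idx, , drop = FALSE]

# Parametric bootstrap (replace = FALSE):
# simulate a fresh data set from the fitted model. Here the "model" is one
# multivariate normal with the sample mean and covariance (an assumption
# made to keep the sketch short).
mu_hat <- colMeans(x)
sigma_hat <- cov(x)
z <- matrix(rnorm(n * p), n, p)
# chol() returns upper-triangular R with t(R) %*% R == sigma_hat,
# so the rows of z %*% R have covariance sigma_hat
x_simulated <- sweep(z %*% chol(sigma_hat), 2, mu_hat, "+")
```

In both cases the model would then be refitted to the new data set, and the procedure repeated `B` times.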

## Value

`bootstrap` gives standard errors. `bootstrap.noc` returns a list with components `ret`, a B by (g2-g1) matrix of -2log(Lambda), `vlk`, the log-likelihood for each g in the range of g1 to g2, and `pvalue`, the p-values of g vs g+1. The results of fitting the mixture models are stored in the current working directory and can be read back into R with a command of the form `obj <- dget("ReturnOf_g_???.ret")`, where `???` stands for the number of components (for example, `ReturnOf_g_2.ret`).
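One common way to turn bootstrap replicates of -2log(Lambda) into a p-value is sketched below. Whether `bootstrap.noc` computes `pvalue` with exactly this convention is not stated on this page, so the function `boot_pvalue` and the numbers fed to it are purely illustrative.

```r
# Hypothetical helper: bootstrap p-value for an observed test statistic,
# counting the observed value itself as one of the B + 1 values.
boot_pvalue <- function(observed, replicates) {
  (1 + sum(replicates >= observed)) / (length(replicates) + 1)
}

# Made-up replicates standing in for one column of `ret`
replicates <- c(1, 2, 3, 12)
p_val <- boot_pvalue(10, replicates)
print(p_val)
```

Under this convention the smallest attainable p-value is 1/(B+1), which is 0.05 for B = 19 and 0.01 for the default B = 99.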

## References

McLachlan G.J. and Krishnan T. (2008). The EM Algorithm and Extensions (2nd ed.). New Jersey: Wiley.

McLachlan G.J. and Peel D. (2000). Finite Mixture Models. New York: Wiley.

## See Also

`EmSkew`, `rdemmix`

## Examples

```r
n1 <- 300; n2 <- 300; n3 <- 400
nn <- c(n1, n2, n3)

n <- sum(nn)
p <- 2
g <- 3

sigma <- array(0, c(p, p, g))
for (h in 1:3) sigma[, , h] <- diag(p)

mu <- cbind(c(4, -4), c(3.5, 4), c(0, 0))

# for other distributions,
# delta <- cbind(c(3, 3), c(1, 5), c(-3, 1))
# dof <- c(3, 5, 5)

distr <- "mvn"
ncov <- 3

# first we generate a data set
set.seed(111)  # random seed is set
dat <- rdemmix(nn, p, g, distr, mu, sigma, dof = NULL, delta = NULL)

# start from initial partition
clust <- rep(1:g, nn)
obj <- EmSkewfit1(dat, g, clust, distr, ncov, itmax = 1000, epsilon = 1e-5)

# do bootstrap (standard error analysis)

## Not run:
std <- bootstrap(dat, n, p, g, distr, ncov, obj, B = 19,
                 replace = TRUE, itmax = 1000, epsilon = 1e-5)
print(std)

# do bootstrap analysis of -2log(Lambda).

# alternatively data can be input as follows,
# dat <- read.table("mydata.txt", header = TRUE)
# p <- ncol(dat)
# n <- nrow(dat)

lad <- bootstrap.noc(dat, n, p, 2, 4, distr, ncov, B = 19,
                     replace = FALSE, itmax = 1000, epsilon = 1e-5)
print(lad)

# return of g=2
obj2 <- dget("ReturnOf_g_2.ret")

# return of g=3
obj3 <- dget("ReturnOf_g_3.ret")

# return of g=4
obj4 <- dget("ReturnOf_g_4.ret")

# The posterior probability matrix for (g=3) is obtained by
tau <- obj3$tau

## End(Not run)
```