Obtains an initial parameter set for use in the EM algorithm. The data are grouped by one of three possible clustering methods: k-means, random start, or hierarchical clustering.
dat |
The dataset, an n by p numeric matrix, where n is number of observations and p the dimension of data. |
g |
The number of components of the mixture model |
distr |
A three letter string indicating the type of distribution to be fit. See Details. |
ncov |
A small integer indicating the type of covariance structure. See Details. |
clust |
An initial partition of the data |
nkmeans |
An integer specifying the number of k-means partitions to be tried when searching for the best initial values. |
nrandom |
An integer specifying the number of random partitions to be tried when searching for the best initial values. |
nhclust |
A logical value to specify whether or not to use hierarchical cluster methods. If TRUE, the Complete Linkage method will be used. |
maxloop |
An integer specifying how many iterations are tried when searching for the initial values; the default value is 10. |
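The three partitioning strategies named by nkmeans, nrandom and nhclust can be illustrated with base R alone. This is only a sketch under stated assumptions: the toy data and the names clust_km, clust_rand and clust_hc are hypothetical, and the code does not claim to reproduce the package's internal procedure.

```r
# Illustrative only: base-R analogues of the three partitioning strategies.
set.seed(1)
n <- 150; p <- 2; g <- 3

# Toy data (hypothetical): three shifted Gaussian clouds, 50 rows each
dat <- rbind(matrix(rnorm(100, mean = 0),  ncol = p),
             matrix(rnorm(100, mean = 4),  ncol = p),
             matrix(rnorm(100, mean = -4), ncol = p))

# One k-means partition (cf. the nkmeans argument)
clust_km <- kmeans(dat, centers = g)$cluster

# One random partition (cf. the nrandom argument)
clust_rand <- sample(rep_len(1:g, n))

# Hierarchical partition with complete linkage (cf. nhclust = TRUE)
clust_hc <- cutree(hclust(dist(dat), method = "complete"), k = g)

# Component sizes under each partition (rows = components)
sizes <- sapply(list(kmeans = clust_km, random = clust_rand, hclust = clust_hc),
                tabulate, nbins = g)
print(sizes)
```

Each label vector has one entry per observation and could serve as the kind of initial partition described for the clust argument.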
The distribution type is determined by the distr parameter, which may take any one of the following values: "mvn" for a multivariate normal distribution, "mvt" for a multivariate t-distribution, "msn" for a multivariate skew normal distribution and "mst" for a multivariate skew t-distribution.
The covariance matrix type, represented by the ncov parameter, may be any one of the following: ncov=1 for a common variance, ncov=2 for a common diagonal variance, ncov=3 for a general variance, ncov=4 for a diagonal variance, ncov=5 for sigma(h)*I(p) (a diagonal covariance whose diagonal elements are all identical).
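To make the five ncov structures concrete, the following sketch builds a p by p by g covariance array from a given partition using base R. It reflects one plausible reading of the descriptions above (for example, "common" taken as a pooled within-component covariance); the helper init_sigma is hypothetical and is not a function of the package.

```r
# Hypothetical helper: one reading of the five ncov structures.
init_sigma <- function(dat, clust, g, ncov) {
  p  <- ncol(dat)
  n  <- nrow(dat)
  nh <- tabulate(clust, nbins = g)

  # Per-component sample covariance matrices
  S <- array(0, c(p, p, g))
  for (h in 1:g) S[, , h] <- cov(dat[clust == h, , drop = FALSE])

  # Pooled within-component covariance (assumed meaning of "common")
  pooled <- matrix(0, p, p)
  for (h in 1:g) pooled <- pooled + (nh[h] - 1) * S[, , h]
  pooled <- pooled / (n - g)

  out <- array(0, c(p, p, g))
  for (h in 1:g) {
    out[, , h] <- switch(as.character(ncov),
      "1" = pooled,                                  # common
      "2" = diag(diag(pooled), nrow = p),            # common diagonal
      "3" = S[, , h],                                # general
      "4" = diag(diag(S[, , h]), nrow = p),          # diagonal
      "5" = mean(diag(S[, , h])) * diag(p),          # sigma(h) * I(p)
      stop("ncov must be an integer from 1 to 5"))
  }
  out
}

# Toy data and partition (hypothetical)
set.seed(1)
dat   <- matrix(rnorm(300), ncol = 2)
clust <- rep(1:3, each = 50)
s1 <- init_sigma(dat, clust, g = 3, ncov = 1)
s4 <- init_sigma(dat, clust, g = 3, ncov = 4)
s5 <- init_sigma(dat, clust, g = 3, ncov = 5)
```

With ncov = 1 or 2 every slice of the array is the same matrix, whereas ncov = 3, 4 and 5 allow each component its own matrix with progressively fewer free parameters.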
The return values include the following components: pro, a numeric vector of the mixing proportions of the components; mu, a p by g matrix with each column as the mean of the corresponding component; sigma, a three dimensional p by p by g array whose jth slice (p,p,j) is the covariance matrix of the jth component of the mixture model; dof, a vector of degrees of freedom for each component; delta, a p by g matrix with its columns corresponding to skew parameter vectors.
When the dataset is huge, it becomes time-consuming to use a large maxloop to try every initial partition. The default is 10.
During the search for the best initial clustering and initial values, the degrees of freedom dof are not estimated for the t-distribution and skew t-distribution; instead they are fixed at 4 for each component.
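The shapes of the returned components can be checked with a small base-R sketch that assembles a list of the same form from a known partition. The data, the partition and the zero-valued delta are assumptions made purely for illustration; only the component names, their dimensions and the fixed dof of 4 come from the text above.

```r
# Illustrative only: a list with the same components and shapes as described.
set.seed(2)
p <- 2; g <- 3
dat   <- matrix(rnorm(60 * p), ncol = p)   # toy data, 60 observations
clust <- rep(1:g, times = c(20, 20, 20))   # hypothetical partition

# Mixing proportions: share of observations in each component (length g)
pro <- tabulate(clust, nbins = g) / nrow(dat)

# Component means as the columns of a p x g matrix
mu <- sapply(1:g, function(h) colMeans(dat[clust == h, , drop = FALSE]))

# Component covariance matrices in a p x p x g array
sigma <- array(0, c(p, p, g))
for (h in 1:g) sigma[, , h] <- cov(dat[clust == h, , drop = FALSE])

dof   <- rep(4, g)         # fixed at 4 during initialisation, as stated above
delta <- matrix(0, p, g)   # placeholder skew vectors (assumption)

init <- list(pro = pro, mu = mu, sigma = sigma, dof = dof, delta = delta)
str(init)
```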
pro |
A vector of mixing proportions, see Details. |
mu |
A numeric matrix with each column corresponding to the mean, see Details. |
sigma |
An array of dimension (p,p,g) whose first two dimensions hold the covariance matrix of each component, see Details. |
dof |
A vector of degrees of freedom for each component, see Details. |
delta |
A p by g matrix with each column corresponding to a skew parameter vector, see Details. |
McLachlan G.J. and Krishnan T. (2008). The EM Algorithm and Extensions (2nd ed.). New Jersey: Wiley.
McLachlan G.J. and Peel D. (2000). Finite Mixture Models. New York: Wiley.
# true component parameters used to simulate the data
sigma <- array(0, c(2,2,3))
for(h in 2:3) sigma[,,h] <- diag(2)
sigma[,,1] <- cbind(c(1,0.2), c(0.2,1))
mu <- cbind(c(4,-4), c(3.5,4), c(0,0))
delta <- cbind(c(3,3), c(1,5), c(-3,1))
dof <- c(3,5,5)
pro <- c(0.3,0.3,0.4)
# component sizes, total sample size, dimension and number of components
n1 <- 300; n2 <- 300; n3 <- 400
nn <- c(n1,n2,n3)
n <- 1000
p <- 2
ng <- 3
distr <- "mvn"
ncov <- 3
# first we generate a data set
set.seed(111) # random seed is set
dat <- rdemmix(nn,p,ng,distr,mu,sigma,dof,delta)
# initial values from the true partition
clust <- rep(1:ng,nn)
initobj1 <- initEmmix(dat,ng,clust,distr,ncov)
# initial values from the best of 10 k-means partitions
initobj2 <- init.mix(dat,ng,distr,ncov,nkmeans=10,nrandom=0,nhclust=FALSE)
Loading required package: lattice
Loading required package: mvtnorm
Loading required package: KernSmooth
KernSmooth 2.23 loaded
Copyright M. P. Wand 1997-2009