Performs model-based clustering using a mixture of shifted asymmetric Laplace (SAL) distributions. The expectation-maximization (EM) algorithm is used for parameter estimation, Aitken's acceleration criterion is used to determine convergence, and both the BIC and ICL values are reported for the fitted mixture.
x |
An n by p matrix where each row corresponds to a p-dimensional observation. |
G |
The desired number of mixture components. |
start |
Specifies how to initialize the zig matrix. If start equals 1, k-means clustering is used. If start equals 2, a random start is used. If start is a vector of length n, the zig matrix is constructed from this vector. |
max.it |
The maximum number of iterations for the EM algorithm. |
eps |
The convergence tolerance: the largest acceptable difference between the asymptotic estimate of the log-likelihood and the current log-likelihood value. |
print.it |
If TRUE, the iteration number of the EM algorithm is printed. |
print.warn |
If TRUE, the number of the observation that the mean vector is closest to is printed. |
print.prmtrs |
If TRUE, the parameter set is printed on each iteration of the EM algorithm. |
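The vector form of start can be illustrated with a small sketch. It assumes that the zig matrix is an n by G matrix of component-membership indicators (one row per observation, one column per component), which is the usual meaning in the mixture-model literature but is not stated on this page. The helper labels_to_zig is hypothetical and is not part of the package.

```r
# Sketch only: turn a vector of starting labels into an n x G indicator matrix.
# labels_to_zig is a hypothetical helper, not a function exported by the package.
labels_to_zig <- function(labels, G) {
  labels <- as.integer(labels)
  stopifnot(all(labels >= 1L), all(labels <= G))
  n <- length(labels)
  zig <- matrix(0, nrow = n, ncol = G)
  # Matrix indexing: set entry (i, labels[i]) to 1 for every observation i.
  zig[cbind(seq_len(n), labels)] <- 1
  zig
}

start <- c(1, 2, 2, 1, 2)          # made-up starting labels for n = 5, G = 2
zig <- labels_to_zig(start, G = 2)
zig
```

Each row of the result has a single 1 in the column of the component that the observation starts in.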
The mixture of SAL distributions is fitted using an EM algorithm with a “set-back” procedure to deal with the infinite log-likelihood values that can arise when updating the mean vector (see Section 3.4.2 of Franczak et al. (2014) for details).
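The convergence check controlled by eps can be sketched as follows. This is a minimal illustration of a common form of the Aitken acceleration criterion, in which an asymptotic estimate of the log-likelihood is computed from the last three log-likelihood values and compared with the current value. The helper aitken_converged is hypothetical, and the exact rule used inside msal may differ (for example, in which log-likelihood value the asymptotic estimate is compared against).

```r
# Sketch of an Aitken-acceleration stopping rule (assumed form; the package's
# internal check may differ). aitken_converged is a hypothetical helper.
aitken_converged <- function(loglik, eps) {
  k <- length(loglik)
  if (k < 3) return(FALSE)                 # three consecutive values are needed
  l_old  <- loglik[k - 2]
  l_prev <- loglik[k - 1]
  l_curr <- loglik[k]
  denom <- l_prev - l_old
  # Degenerate case: the log-likelihood has stalled, so fall back to a
  # plain check on the latest change.
  if (denom == 0) return(abs(l_curr - l_prev) < eps)
  a <- (l_curr - l_prev) / denom           # Aitken acceleration factor
  if (a == 1) return(abs(l_curr - l_prev) < eps)
  # Asymptotic estimate of the log-likelihood.
  l_inf <- l_prev + (l_curr - l_prev) / (1 - a)
  abs(l_inf - l_curr) < eps
}

# Made-up log-likelihood trace converging geometrically to -100.
trace <- -100 - 10 * 0.5^(1:10)
aitken_converged(trace[1:9], eps = 0.01)
aitken_converged(trace, eps = 0.01)
```

In this toy trace the asymptotic estimate is exactly -100, so the rule fires once the current value is within eps of that limit rather than when successive values merely stop changing much.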
The msal function outputs a list with the following components:
loglik |
A vector giving the log-likelihood values from each iteration of the considered EM algorithm. |
alpha |
A matrix where each row specifies the direction of skewness in each variable for each mixture component. |
sig |
An array where each matrix specifies the covariance matrix for each mixture component. |
mu |
A matrix where each row gives the mean vector for each mixture component. |
pi.g |
A vector specifying the mixing proportions. |
bic |
A numeric value giving the Bayesian Information Criterion (BIC) for the fitted model. |
icl |
A numeric value giving the Integrated Completed Likelihood (ICL) for the fitted model. |
cluster |
A vector of length n giving the group label for each observation in the considered data set. |
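To make the relationship between loglik, bic and icl concrete, here is a hedged sketch of how the two criteria are commonly computed for a mixture model. It rests on three assumptions that this page does not confirm: the "larger is better" convention BIC = 2 * loglik - (number of free parameters) * log(n); an unconstrained parameter count of p values each for mu and alpha plus p(p+1)/2 for sig per component; and an ICL that adds twice the log of each observation's largest posterior membership probability. All function names and input values below are hypothetical.

```r
# Sketch under stated assumptions; not the package's internal code.
# Assumed free-parameter count for an unconstrained G-component, p-variate mixture:
#   (G - 1) mixing proportions + G * (p for mu + p for alpha + p(p+1)/2 for sig).
n_free_params <- function(G, p) {
  (G - 1) + G * (2 * p + p * (p + 1) / 2)
}

# Assumed convention: BIC = 2 * loglik - rho * log(n), so larger is better.
bic_from_loglik <- function(loglik, n, G, p) {
  2 * loglik - n_free_params(G, p) * log(n)
}

# ICL penalises the BIC by the uncertainty of the classification, using an
# n x G matrix z of posterior membership probabilities (hypothetical input).
icl_from_bic <- function(bic, z) {
  map <- max.col(z, ties.method = "first")        # most probable component per row
  z_map <- z[cbind(seq_len(nrow(z)), map)]        # its posterior probability
  bic + 2 * sum(log(z_map))
}

# Made-up inputs for illustration only.
z_toy <- rbind(c(0.9, 0.1), c(0.2, 0.8), c(1.0, 0.0), c(0.3, 0.7))
bic_toy <- bic_from_loglik(loglik = -10, n = nrow(z_toy), G = 2, p = 2)
icl_toy <- icl_from_bic(bic_toy, z_toy)
c(bic = bic_toy, icl = icl_toy)
```

Under these assumptions the ICL can never exceed the BIC, and the two coincide only when every observation is assigned to its component with probability 1.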
Brian C. Franczak [aut, cre], Ryan P. Browne [aut, ctb], Paul D. McNicholas [aut, ctb]
Maintainer: Brian C. Franczak <franczakb@macewan.ca>
Franczak, B. C., Browne, R. P., and McNicholas, P. D. (2014). Mixtures of Shifted Asymmetric Laplace Distributions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(6), 1149-1157.
## Load the package that provides msal(), rmsal() and the yeast data
library(MixSAL)

## Clustering Simulated Data
alpha <- matrix(c(2,2,1,2),2,2)
sig <- array(NA,dim=c(2,2,2))
sig[,,1] <- diag(2)
sig[,,2] <- matrix(c(1,0.5,0.5,1),2,2)
mu <- matrix(c(0,0,-2,5),2,2)
pi.g <- rep(1/2,2)
x <- rmsal(n=500,p=2,alpha=alpha,sig=sig,mu=mu,pi.g=pi.g)
msal.ex1 <- msal(x=x[,-1],G=2)
table(x[,1],msal.ex1$cluster)
## Clustering the Old Faithful Geyser Data
data(faithful)
msal.ex2 <- msal(x=faithful,G=2)
plot(x=faithful,col=msal.ex2$cluster)
## Clustering the Yeast Data
data(yeast)
msal.ex3 <- msal(x=yeast[,-1],G=2)
table(yeast[,1],msal.ex3$cluster)