
Carries out model-based clustering or classification using some or all of the 14 parsimonious Skew-t clustering models (STPCM).


`data` |
A matrix or data frame in which rows correspond to observations and columns correspond to variables. Note that this function currently works only with multivariate data (p > 1). |

`G` |
A sequence of integers giving the number of components to be used. |

`mnames` |
The models (i.e., covariance structures) to be used. If |

`start` |
If |

`label` |
If |

`veo` |
Stands for "Variables exceed observations". If |

`da` |
Stands for "Deterministic Annealing". A vector of doubles. |

`nmax` |
The maximum number of iterations each EM algorithm is allowed to use. |

`atol` |
A number specifying the epsilon value for the convergence criteria used in the EM algorithms. For each algorithm, the criterion is based on the difference between the log-likelihood at an iteration and an asymptotic estimate of the log-likelihood at that iteration. This asymptotic estimate is based on the Aitken acceleration and details are given in the References. |

`mtol` |
A number specifying the epsilon value for the convergence criteria used in the M-step in the EM algorithms. |

`mmax` |
The maximum number of iterations each M-step is allowed in the GEM algorithms. |

`burn` |
The burn-in period for imputing data. (Missing observations are removed and a model is estimated separately before placing an imputation step within the EM.) |

`pprogress` |
If |

`pwarning` |
If |

`stochastic` |
If |
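
The `atol` argument above refers to a stopping rule based on Aitken acceleration. The following is a minimal base-R sketch of the commonly used form of that rule, not the package's internal code: the helper names `aitken_estimate` and `aitken_converged`, the default `atol = 1e-2`, and the log-likelihood trace are all assumptions made for illustration.

```r
# Asymptotic log-likelihood estimate from the last three values of a
# log-likelihood trace (hypothetical helper, not part of the package).
aitken_estimate <- function(loglik) {
  k <- length(loglik)
  stopifnot(k >= 3)
  l_prev <- loglik[k - 2]
  l_curr <- loglik[k - 1]
  l_next <- loglik[k]
  # If the trace has stalled there is nothing to extrapolate.
  if (l_curr == l_prev) return(l_next)
  # Aitken acceleration factor: ratio of successive increments.
  a <- (l_next - l_curr) / (l_curr - l_prev)
  # Extrapolated limit of the sequence.
  l_curr + (l_next - l_curr) / (1 - a)
}

# Declare convergence when the extrapolated limit is within atol of the
# current log-likelihood (atol default is an assumed value).
aitken_converged <- function(loglik, atol = 1e-2) {
  if (length(loglik) < 3) return(FALSE)
  gap <- aitken_estimate(loglik) - loglik[length(loglik)]
  is.finite(gap) && abs(gap) < atol
}

# Made-up trace that converges geometrically to -100.
ll_trace <- -100 - 10 * 0.5^(1:12)
aitken_estimate(ll_trace[1:3])    # extrapolates the limit from three values
aitken_converged(ll_trace[1:3])   # early iterations: not yet converged
aitken_converged(ll_trace)        # later iterations: within tolerance
```

Because the estimate extrapolates the limit rather than comparing consecutive values, a slowly increasing log-likelihood is less likely to be mistaken for a converged one.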

The data `x` are either clustered or classified using Skew-t mixture models with some or all of the 14 parsimonious covariance structures described in Celeux & Govaert (1995). The algorithms given by Celeux & Govaert (1995) are used for 12 of the 14 models; the "EVE" and "VVE" models use the algorithms given in Browne & McNicholas (2014). Starting values are very important to the successful operation of these algorithms, so care must be taken in the interpretation of results.
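
The covariance structures of Celeux & Govaert (1995) come from the eigen-decomposition of each component covariance matrix into a volume, a shape matrix and an orientation matrix, with each letter of a model name marking that part as equal (E) or variable (V) across components. As a rough illustration only, the base-R sketch below builds two covariance matrices with the "EVE" pattern (equal volume, variable shape, equal orientation). The helper `make_sigma` and all numeric values are hypothetical and are not taken from the package.

```r
# Build a covariance matrix as lambda * D %*% A %*% t(D), where lambda is
# the volume, A a diagonal shape matrix with determinant 1, and D an
# orthogonal orientation matrix (hypothetical helper for illustration).
make_sigma <- function(lambda, a_diag, D) {
  p <- length(a_diag)
  A <- diag(a_diag / prod(a_diag)^(1 / p))  # rescale so that det(A) = 1
  lambda * D %*% A %*% t(D)
}

# Shared orientation: a rotation by 30 degrees in two dimensions.
theta <- pi / 6
D <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)

# "EVE": same volume and orientation, different shapes.
sigma_1 <- make_sigma(lambda = 2, a_diag = c(4, 1), D = D)
sigma_2 <- make_sigma(lambda = 2, a_diag = c(1, 9), D = D)

# Equal volume shows up as equal determinants (lambda^p for both).
det(sigma_1)
det(sigma_2)
```

Constraining parts of the decomposition to be shared is what reduces the number of free covariance parameters relative to the unconstrained "VVV" model.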

An object of class `vgpcm` is a list with components:

`map` |
A vector of integers indicating the maximum |

`model_objs` |
A list of all estimated models with parameters returned from the C++ call. |

`best_model` |
An object of class `vgpcm_best` containing the number of groups for the best model, the covariance structure, and the Bayesian Information Criterion (BIC) value. |

`loglik` |
The log-likelihood values from fitting the best model. |

`z` |
A matrix giving the raw values upon which |

`BIC` |
A G by mnames by 3 dimensional array with values pertaining to BIC calculations. (legacy) |

`gpar` |
A list object for each cluster pertaining to parameters. (legacy) |

`startobject` |
The type of object inputted into |

`row_tags` |
If there were NAs in the original dataset, a vector of indices referencing the row of the imputed vectors is given. |

An object of class `stpcm_best` is a list with components:

`model_type` |
A string containing summarized information about the type of model estimated (covariance structure and number of groups). |

`model_obj` |
An internal list containing all parameters returned from the C++ call. |

`BIC` |
Bayesian Information Criterion (positive scale, bigger is better). |

`loglik` |
Log-likelihood from the estimated model. |

`nparam` |
Number of parameters in the model. |

`startobject` |
The type of object inputted into |

`G` |
An integer representing the number of groups. |

`cov_type` |
A string representing the type of covariance matrix (see 14 models). |

`status` |
Convergence status of the EM algorithm according to Aitken's acceleration.

`map` |
A vector of integers indicating the maximum |

`row_tags` |
If there were NAs in the original dataset, a vector of indices referencing the row of the imputed vectors is given. |
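
The phrase "positive scale, bigger is better" for `BIC` suggests the convention in which BIC is twice the log-likelihood minus a penalty of `nparam` times the log of the sample size. The exact computation inside the package is not shown on this page, so the sketch below is an assumption, and the function name `bic_positive` and every number in it are made up for illustration.

```r
# BIC on the "bigger is better" scale (assumed convention):
# 2 * log-likelihood minus nparam * log(number of observations).
bic_positive <- function(loglik, nparam, n) {
  2 * loglik - nparam * log(n)
}

# Made-up fits for two covariance structures on n = 200 observations.
candidates <- data.frame(
  model  = c("VVV", "EVE"),
  loglik = c(-512.3, -518.9),
  nparam = c(17, 14)
)
candidates$BIC <- bic_positive(candidates$loglik, candidates$nparam, n = 200)

# Under this convention the preferred model has the largest BIC.
best <- as.character(candidates$model[which.max(candidates$BIC)])
best
```

In this toy comparison the more parsimonious model can win even with a lower log-likelihood, because it pays a smaller penalty.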

All classes contain an internal list called `model_obj` or `model_objs` with the following components:

`zigs` |
A posteriori matrix

`G` |
An integer representing the number of groups. |

`sigs` |
A vector of covariance matrices for each group |

`mus` |
A vector of location vectors for each group |

`alphas` |
A vector containing skewness vectors for each group

`gammas` |
A vector containing estimated gamma parameters for each group |
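
To make the relationship between an a posteriori matrix and hard group labels concrete, the sketch below assigns each observation to the group with the largest posterior probability. It uses a small made-up matrix standing in for `zigs` and assumes one row per observation and one column per group; whether the package derives `map` in exactly this way is not stated on this page.

```r
# Made-up a posteriori matrix: 3 observations (rows), 2 groups (columns).
zigs <- matrix(
  c(0.90, 0.10,
    0.20, 0.80,
    0.55, 0.45),
  ncol = 2, byrow = TRUE
)

# Each row is a probability distribution over groups.
rowSums(zigs)

# Maximum a posteriori labels: index of the largest entry in each row.
map_labels <- apply(zigs, 1, which.max)
map_labels
```

Rows with nearly equal probabilities, such as the third one here, are observations whose assignment is uncertain even though they still receive a single label.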

Dedicated `print`, `plot` and `summary` functions are available for objects of class `vgpcm`.

Nik Pocuca, Ryan P. Browne and Paul D. McNicholas.

Maintainer: Paul D. McNicholas <mcnicholas@math.mcmaster.ca>

McNicholas, P.D. (2016). *Mixture Model-Based Classification*. Boca Raton: Chapman & Hall/CRC Press.

Browne, R.P. and McNicholas, P.D. (2014). Estimating common principal components in high dimensions. *Advances in Data Analysis and Classification* **8**(2), 217-226.

Wei, Y., Tang, Y. and McNicholas, P.D. (2019). Mixtures of generalized hyperbolic distributions and mixtures of skew-t distributions for model-based clustering with incomplete data. *Computational Statistics and Data Analysis* **130**, 18-41.

Celeux, G., Govaert, G. (1995). Gaussian parsimonious clustering models. *Pattern Recognition* **28**(5), 781-793.

```
data("sx3")
## Not run:
### estimate "VVV" "EVE"
ax = stpcm(sx3, G=1:3, mnames=c("VVV","EVE"), start=0)
summary(ax)
ax
### estimate all 14 covariance structures
ax = stpcm(sx3, G=1:3, mnames=NULL, start=0)
summary(ax)
ax
### model based classification
sx3.label = c(rep(1,1000),rep(2,1000))
plot(sx3, col=sx3.label)
axl = stpcm(sx3, G=2, mnames=c("VVV", "EVE"), label=sx3.label)
summary(axl)
## End(Not run)
```
