Fits multivariate t-distribution mixture models (with eigen-decomposed covariance structure) to the given data within a clustering paradigm (default) or a classification paradigm (by giving either a training index or the percentage of data taken to be known). Can be run in parallel.
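For orientation, each mixture component in such a model is a multivariate t density. The following is a minimal base-R sketch of that density, written for illustration only; the function name `dmvt_sketch` is hypothetical and is not part of the package.

```r
# Illustrative multivariate t density (not the package's internal code).
# x: numeric vector or matrix of points (one row per point)
# mu: location vector, Sigma: p x p scale matrix, df: degrees of freedom
dmvt_sketch <- function(x, mu, Sigma, df) {
  p <- length(mu)
  x <- matrix(x, ncol = p)
  # Squared Mahalanobis distance of each row of x from mu
  delta <- mahalanobis(x, center = mu, cov = Sigma)
  log_det <- as.numeric(determinant(Sigma, logarithm = TRUE)$modulus)
  log_const <- lgamma((df + p) / 2) - lgamma(df / 2) -
    (p / 2) * log(pi * df) - 0.5 * log_det
  exp(log_const - ((df + p) / 2) * log1p(delta / df))
}

# Bivariate evaluation at the origin
dens_2d <- dmvt_sketch(c(0, 0), mu = c(0, 0), Sigma = diag(2), df = 50)

# In one dimension the sketch should agree with stats::dt()
dens_1d <- dmvt_sketch(0.3, mu = 0, Sigma = matrix(1), df = 5)
```

As the degrees of freedom grow, this density approaches the multivariate Gaussian, which is why a large starting value for the degrees of freedom behaves almost like a Gaussian mixture.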
x 
A numeric matrix, data frame, or vector (for univariate data). 
Gs 
A number or vector indicating the number of groups to fit. Default is 1 to 9. 
models 
A character vector giving the models to fit. See details for a comprehensive list of choices. 
init 
A list of initializing classification of the form that 
scale 
Logical indicating whether or not the function should scale the data. Default is 
dfstart 
The initialized value for the degrees of freedom. The default is 50. 
training 
Optional indexing vector for the observations whose classification is taken to be known. 
known 
A vector of known classifications that can be numeric or character; optional for clustering, necessary for classification. Must be the same length as the number of rows in the data set. If using in a true classification sense, give samples with unknown classification the value 
gauss 
Logical indicating if the algorithm should use the Gaussian distribution. If 
dfupdate 
Character string or logical indicating how the degrees of freedom should be estimated. The default is 
eps 
Vector (of size 2) giving tolerance values for the convergence criterion. The first value is the tolerance level for the iterated M-steps. The second value is the tolerance for the EM algorithm: convergence is based on Aitken's acceleration; see the cited papers for more information. 
verbose 
Logical indicating whether the running output should be displayed. This option is not available in parallel. What is displayed depends on the width of the R window. With a width of 80 or larger: time run, estimated time remaining, percent complete are all displayed. 
maxit 
Vector (of size 2) giving the maximum number of iterations for the iterated M-steps and the EM algorithm, respectively. A warning is displayed if either maximum is reached. The default for both is Inf (i.e., no limit). 
convstyle 
Character string specifying the method of determining convergence. Default is "aitkens", which uses a criterion based on Aitken's acceleration, but "lop" (lack of progress) may be used instead. 
parallel.cores 
Logical indicating whether to run teigen in parallel or not. If 
ememargs 
A list of the controls for the emEM initialization with named elements:

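The eps and convstyle arguments above refer to two stopping rules for the EM algorithm. The sketch below shows one common textbook form of each rule applied to a sequence of log-likelihood values. It is an assumption about the general technique, not the package's actual implementation: the function names are hypothetical, and the exact comparison used by the package may differ.

```r
# Aitken-based stopping rule (one common form, assumed here):
# estimate the asymptotic log-likelihood from the last three values
# and stop when it is within eps of the current value.
aitken_check <- function(loglik, eps = 0.1) {
  k <- length(loglik)
  if (k < 3) return(list(l_inf = NA_real_, converged = FALSE))
  l_prev <- loglik[k - 2]
  l_cur  <- loglik[k - 1]
  l_next <- loglik[k]
  a <- (l_next - l_cur) / (l_cur - l_prev)      # Aitken acceleration
  l_inf <- l_cur + (l_next - l_cur) / (1 - a)   # asymptotic estimate
  list(l_inf = l_inf, converged = isTRUE(abs(l_inf - l_next) < eps))
}

# Lack-of-progress rule: stop when successive values barely change.
lop_check <- function(loglik, eps = 0.1) {
  k <- length(loglik)
  k >= 2 && abs(loglik[k] - loglik[k - 1]) < eps
}

# Hypothetical log-likelihood trace converging geometrically to -100
trace <- -100 - 50 * 0.5^(0:10)
early <- aitken_check(trace[1:3])
late  <- aitken_check(trace)
```

For a geometrically converging trace like this one, the Aitken estimate recovers the limit after only three iterations, while the convergence flag stays FALSE until the current value is actually close to that limit.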
Model specification (via the models argument) follows either the nomenclature discussed in Andrews and McNicholas (2012) or the nomenclature popularized in other packages. In both cases, the nomenclature refers to the decomposition of, and constraints on, the covariance matrix:
Σ_g = λ_g D_g A_g D_g'
The nomenclature from Andrews and McNicholas (2012) gives four letters, each letter referring to (in order) λ, D, A, and the degrees of freedom. Possible letters are "U" for unconstrained, "C" for constrained (across groups), and "I" for when the parameter is replaced by the appropriately sized identity matrix (where applicable). As an example, the string "UICC" would refer to the model where Σ_g = λ_g A with degrees of freedom held equal across groups.
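The decomposition can be made concrete with base R. The sketch below builds a covariance matrix from its three factors and recovers the factors from an empirical covariance matrix. It assumes the usual convention in this model family that A is diagonal with determinant 1, so that λ carries the volume; the helper names `make_sigma` and `decompose_sigma` are hypothetical.

```r
# Build Sigma = lambda * D %*% A %*% t(D) from its factors
make_sigma <- function(lambda, D, A) {
  lambda * D %*% A %*% t(D)
}

# Recover (lambda, D, A) from a covariance matrix via eigen()
# Assumption: A is normalised so that det(A) = 1
decompose_sigma <- function(Sigma) {
  p <- nrow(Sigma)
  e <- eigen(Sigma, symmetric = TRUE)
  lambda <- prod(e$values)^(1 / p)        # volume, equal to det(Sigma)^(1/p)
  A <- diag(e$values / lambda, nrow = p)  # shape
  D <- e$vectors                          # orientation (eigenvectors)
  list(lambda = lambda, D = D, A = A)
}

# Round trip on the covariance of the faithful data
S <- cov(faithful)
parts <- decompose_sigma(S)
rebuilt <- make_sigma(parts$lambda, parts$D, parts$A)

# The "UICC" structure: group-specific lambda, D = identity, common A
A_common <- diag(c(2, 0.5))
Sigma_1 <- make_sigma(1.5, diag(2), A_common)
Sigma_2 <- make_sigma(4.0, diag(2), A_common)
```

Under "UICC" the two group covariances differ only by a scalar multiple, which is what the shared A and identity D imply.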
The alternative nomenclature describes (in order) the volume (λ), shape (A), orientation (D), and degrees of freedom in terms of "V"ariable, "E"qual, or the "I"dentity matrix. The example model discussed in the previous paragraph would then be called by "VEIE".
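The mapping between the two nomenclatures is mechanical: swap the second and third letters (D and A trade places) and relabel U as V and C as E. A small base-R sketch, with a hypothetical helper name `to_alt_name`, for the four-letter multivariate names:

```r
# Translate a four-letter name in (lambda, D, A, df) order with
# letters U/C/I into (volume, shape, orientation, df) order with V/E/I.
to_alt_name <- function(name) {
  chars <- strsplit(name, "")[[1]]
  stopifnot(length(chars) == 4, all(chars %in% c("U", "C", "I")))
  reordered <- chars[c(1, 3, 2, 4)]   # lambda, A, D, df
  relabel <- c(U = "V", C = "E", I = "I")
  paste(relabel[reordered], collapse = "")
}

alt_uicc <- to_alt_name("UICC")
alt_cucu <- to_alt_name("CUCU")
```

This only illustrates the naming rule described above; it does not check whether a given name corresponds to a model the package actually fits.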
Possible univariate models are c("univUU", "univUC", "univCU", "univCC"), where the first capital letter describes "U"nconstrained or "C"onstrained variance and the second capital letter refers to the degrees of freedom. Once again, "V"ariable or "E"qual can replace U and C, but this time the orders match between the nomenclatures.
As many models as desired can be selected and run via the vector supplied to models. More commonly, subsets can be called by the following character strings:
"all" runs all 28 tEIGEN models (default),
"dfunconstrained" runs the 14 unconstrained degrees of freedom models,
"dfconstrained" runs the 14 constrained degrees of freedom models,
"mclust" runs the 10 MCLUST models using the multivariate Gaussian distribution rather than the multivariate t,
"gaussian" is similar to "mclust" but includes four more mixture models than MCLUST,
"univariate" runs the univariate models; it is called automatically if one of the previous shortcuts is used on univariate data.
Note that adding "alt" to the beginning of the previously mentioned character strings will run the same models, but return results with the VEI nomenclature.
Also note that for G=1, several models are equivalent (for example, UUUU and CCCC). Thus, for G=1 only one model from each set of equivalent models will be run.
x 
Data used for clustering/classification. 
index 
Indexing vector giving observations taken to be known (only available when classification is performed). 
classification 
Vector of group classifications as determined by the BIC. 
bic 
BIC of the best fitted model. 
modelname 
Name of the best model according to the BIC. 
allbic 
Matrix of BIC values according to model and G. A value of -Inf is returned when the model did not converge. 
bestmodel 
Character string giving best model (BIC) details. 
G 
Value corresponding to the number of components chosen by the BIC. 
tab 
Classification table for BIC-selected model (only available when 
fuzzy 
The fuzzy clustering matrix for the model selected by the BIC. 
logl 
The log-likelihood corresponding to the model with the best BIC. 
iter 
The number of iterations until convergence for the model selected by the BIC. 
parameters 
List containing the fitted parameters: 
iclresults 
List containing all the previous outputs, except 
info 
List containing a few of the original user inputs, for use by other dedicated functions of the 
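To illustrate how a fuzzy clustering matrix of the kind listed above relates to a hard classification, the sketch below assigns each observation to the group with the largest membership probability. The matrix is made up for illustration and is not output from the package; taking uncertainty as one minus the largest probability is a common convention, assumed here rather than taken from the package.

```r
# Hypothetical fuzzy matrix: rows are observations, columns are groups,
# and each row sums to 1
fuzzy_demo <- matrix(c(0.90, 0.10,
                       0.20, 0.80,
                       0.55, 0.45),
                     ncol = 2, byrow = TRUE)

# Hard classification: index of the largest membership per row
hard_class <- apply(fuzzy_demo, 1, which.max)

# Classification uncertainty per observation (assumed convention)
uncertainty <- 1 - apply(fuzzy_demo, 1, max)
```

The third observation is the least certain assignment, since its two membership probabilities are nearly equal.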
Jeffrey L. Andrews, Jaymeson R. Wickins, Nicholas M. Boers, Paul D. McNicholas
Andrews JL and McNicholas PD (2012) “Model-based clustering, classification, and discriminant analysis with the multivariate t-distribution: The tEIGEN family” Statistics and Computing 22(5), 1021–1029.
Andrews JL, McNicholas PD, and Subedi S (2011) “Model-based classification via mixtures of multivariate t-distributions” Computational Statistics and Data Analysis 55, 520–529.
See package manual tEIGEN
library(teigen)
###Note that only one model is run for each example
###in order to reduce computation time
#Clustering old faithful data with hard random start
tfaith <- teigen(faithful, models="UUUU", Gs=1:3, init="hard")
plot(tfaith, what = "uncertainty")
summary(tfaith)
#Clustering old faithful with hierarchical starting values
initial_list <- list()
clustree <- hclust(dist(faithful))
for(i in 1:3){
  initial_list[[i]] <- cutree(clustree,i)
}
tfaith <- teigen(faithful, models="CUCU", Gs=1:3, init=initial_list)
print(tfaith)
#Classification with the iris data set
#Introducing NAs is not required; this is to illustrate a `true' classification scenario
irisknown <- iris[,5]
irisknown[134:150] <- NA
triris <- teigen(iris[,-5], models="CUUU", init="uniform", known=irisknown)
##Parallel examples:
###Note: parallel.cores set to 2 in order to comply
###with CRAN submission policies (set to higher
###number or TRUE to automatically use all available cores)
#Clustering old faithful data with tEIGEN
tfaith <- teigen(faithful, models="UUUU",
                 parallel.cores=2, Gs=1:3, init="hard")
plot(tfaith, what = "contour")
#Classification with the iris data set
irisknown <- iris[,5]
irisknown[sample(1:nrow(iris),50)] <- NA
tiris <- teigen(iris[,-5], parallel.cores=2, models="CUUU",
                init="uniform", known=irisknown)
tiris$tab
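The classification examples above mark unlabelled observations by setting entries of the known vector to NA. The base-R sketch below, which does not call the package, shows how such a vector splits the data into labelled and unlabelled parts; the variable names are illustrative.

```r
# Species labels, with the last 17 observations treated as unlabelled
irisknown_demo <- iris[, 5]
irisknown_demo[134:150] <- NA

# Row indices of labelled and unlabelled observations
labelled_idx   <- which(!is.na(irisknown_demo))
unlabelled_idx <- which(is.na(irisknown_demo))

# Label counts, including the NA entries
label_counts <- table(irisknown_demo, useNA = "ifany")

# Measurements only (species column dropped), as passed to the fit
iris_measurements <- iris[, -5]
```

The known vector keeps one entry per row of the data, so its length matches nrow(iris) even though some entries are NA.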
