miscellaneous: Various Functions for Retrieving Information from Clustering...

criterion    R Documentation

Various Functions for Retrieving Information from Clustering Results

Description

Various functions are available to retrieve the information criteria (criterion), the posterior probabilities of cluster membership z (posterior), the “weights” u (importance), the uncertainty (uncertainty), and the estimates of the cluster proportions, means and variances (getEstimates) resulting from the clustering (filtering) operation.

Usage

criterion(object, ...)

## S4 method for signature 'flowClust'
criterion(object, type = "BIC")

## S4 method for signature 'flowClustList'
criterion(object, type = "BIC", max = FALSE, show.K = FALSE)

criterion(object) <- value

## S4 replacement method for signature 'flowClustList,character'
criterion(object) <- value

posterior(object, assign = FALSE)

importance(object, assign = FALSE)

uncertainty(object)

getEstimates(object, data)

Arguments

object

Object returned from flowClust or filter. For the replacement method of criterion, the object must be of class flowClustList or tmixFilterResultList.

...

Further arguments. Currently this is type, a character string. May take "BIC", "ICL" or "logLike", to specify the criterion desired.

type, value

A character string stating the criterion used to choose the best model. May take either "BIC" or "ICL".

max

A logical value indicating whether criterion should return the maximum value of the criterion. Default is FALSE.

show.K

A logical value indicating whether criterion should return K, the number of clusters. Default is FALSE.

assign

A logical value. If TRUE, only the quantity (z for posterior or u for importance) associated with the cluster to which an observation is assigned will be returned. Default is FALSE, meaning that the quantities associated with all the clusters will be returned.

data

A numeric vector, matrix, data frame of observations, or object of class flowFrame; an optional argument. This is the object on which flowClust or filter was performed.
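As a rough illustration of how the type argument relates to model selection, the base-R sketch below picks the preferred number of clusters from a set of criterion values. It does not call the package: the numbers, the names K_values, bic and icl, and the assumption that a larger criterion value is preferred (suggested by the max argument) are all made up for illustration.

```r
# Hypothetical criterion values for models fitted with K = 1, ..., 4 clusters.
# These numbers are invented for illustration; they are not package output.
K_values <- 1:4
bic <- c(-5210.3, -4987.6, -4990.1, -5002.8)
icl <- c(-5210.3, -5020.4, -5015.9, -5060.2)

# Assumption: the model with the largest criterion value is preferred.
best_by_bic <- K_values[which.max(bic)]
best_by_icl <- K_values[which.max(icl)]

cat("Best K by BIC:", best_by_bic, "\n")
cat("Best K by ICL:", best_by_icl, "\n")
```

The two criteria need not agree, which is why the replacement method lets the selected model be switched from one criterion to the other.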

Details

These functions are written to retrieve various slots contained in the object returned from the clustering operation. criterion retrieves object@BIC, object@ICL or object@logLike. Its replacement method modifies object@index and object@criterion to select the best model according to the desired criterion. posterior and importance provide a convenient means of retrieving the information stored in object@z and object@u respectively. uncertainty retrieves object@uncertainty. getEstimates retrieves the information stored in object@mu (transformed back to the original scale) and object@w; when the data object is provided, an approximate variance estimate (on the original scale, obtained by performing one M-step of the EM algorithm without taking the Box-Cox transformation) is also computed.
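The back-transformation of the means to the original scale can be sketched with a standard Box-Cox transform and its inverse. This is only an assumption-laden illustration: the helper names boxcox and inv_boxcox are hypothetical, the means and lambda are invented, and the package's own transformation may use a modified form (for example, to handle negative values).

```r
# Standard Box-Cox transform and its inverse, for positive x.
# Assumption: the package's own transformation may differ in detail.
boxcox <- function(x, lambda) {
  if (lambda == 0) log(x) else (x^lambda - 1) / lambda
}
inv_boxcox <- function(y, lambda) {
  if (lambda == 0) exp(y) else (lambda * y + 1)^(1 / lambda)
}

# Hypothetical cluster means on the original scale:
# K = 2 clusters (rows), P = 2 variables (columns).
mu_original <- matrix(c(100, 250,
                        400, 800), nrow = 2, byrow = TRUE)
lambda <- 0.5

mu_transformed <- boxcox(mu_original, lambda)    # plays the role of object@mu
locations <- inv_boxcox(mu_transformed, lambda)  # means back on the original scale

print(dim(locations))
print(all.equal(locations, mu_original))
```

The round trip recovers the original means, and the result keeps the K x P layout.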

Value

Denote by K the number of clusters, N the number of observations, and P the number of variables. For posterior and importance, a matrix of size N x K is returned if assign=FALSE (default). Otherwise, a vector of size N is returned. uncertainty always returns a vector of size N. getEstimates returns a list with named elements proportions, locations and, if the data object is provided, dispersion. proportions is a vector of size K and contains the estimates of the K cluster proportions. locations is a matrix of size K x P and contains the estimates of the K mean vectors transformed back to the original scale (i.e., rbox(object@mu, object@lambda)). dispersion is an array of dimensions K x P x P, containing the approximate estimates of the K covariance matrices on the original scale.
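The difference between the two return shapes for posterior and importance can be mimicked in base R. The sketch below uses a mock posterior matrix and assumes that each observation is assigned to its most probable cluster; it does not call the package, and the names z, label and assigned_quantities are hypothetical.

```r
# Mock posterior matrix z: N = 4 observations (rows), K = 3 clusters (columns).
z <- matrix(c(0.70, 0.20, 0.10,
              0.10, 0.80, 0.10,
              0.30, 0.30, 0.40,
              0.05, 0.05, 0.90),
            nrow = 4, byrow = TRUE)

# Assumed assignment rule: each observation goes to its most probable cluster.
label <- max.col(z, ties.method = "first")

# Analogue of assign = FALSE: the full N x K matrix.
all_quantities <- z

# Analogue of assign = TRUE: one value per observation, taken from the
# column of the cluster to which that observation is assigned.
assigned_quantities <- z[cbind(seq_len(nrow(z)), label)]

print(dim(all_quantities))
print(assigned_quantities)
```

Indexing with a two-column matrix of (row, column) pairs picks exactly one entry per observation, which gives the vector of size N.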

Note

When object@nu=Inf, the Mahalanobis distances instead of the “weights” are stored in object@u. Hence, importance will retrieve information corresponding to the Mahalanobis distances. If the assign argument is set to TRUE, only the quantities corresponding to assigned observations will be returned. Quantities corresponding to unassigned observations (outliers and filtered observations) will be reported as NA. Hence, a change in the rule to call outliers will incur a change in the number of NA values returned.
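The effect of the assignment rule on the number of NA values can be illustrated with a stand-in rule. In the sketch below, an observation is treated as unassigned when its largest posterior probability falls below a cutoff; this rule, the helper assigned_value and the mock data are assumptions made for illustration, not the package's actual outlier rule.

```r
# Mock posterior matrix: 5 observations (rows), 2 clusters (columns).
z <- matrix(c(0.95, 0.05,
              0.60, 0.40,
              0.10, 0.90,
              0.55, 0.45,
              0.20, 0.80),
            ncol = 2, byrow = TRUE)

# Stand-in rule (an assumption): an observation is left unassigned, and
# reported as NA, when its largest posterior is below `cutoff`.
assigned_value <- function(z, cutoff) {
  best <- apply(z, 1, max)
  ifelse(best < cutoff, NA_real_, best)
}

loose  <- assigned_value(z, cutoff = 0.5)
strict <- assigned_value(z, cutoff = 0.7)

cat("NA count with cutoff 0.5:", sum(is.na(loose)), "\n")
cat("NA count with cutoff 0.7:", sum(is.na(strict)), "\n")
```

Tightening the rule leaves more observations unassigned, so the same fit yields more NA entries.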

Author(s)

Raphael Gottardo <raph@stat.ubc.ca>, Kenneth Lo <c.lo@stat.ubc.ca>

References

Lo, K., Brinkman, R. R. and Gottardo, R. (2008) Automated Gating of Flow Cytometry Data via Robust Model-based Clustering. Cytometry A 73, 321-332.

See Also

flowClust, filter, Map


RGLab/flowClust documentation built on Jan. 31, 2024, 11:26 p.m.