RobustTrimmedClustering: Robust Trimmed Clustering

View source: R/RobustTrimmedClustering.R

RobustTrimmedClusteringR Documentation

Robust Trimmed Clustering

Description

Robust Trimmed Clustering invented by [Garcia-Escudero et al., 2008] and implemented by [Fritz et al., 2012].

Usage

RobustTrimmedClustering(Data, ClusterNo,

Alpha=0.05,PlotIt=FALSE,...)

Arguments

Data

[1:n,1:d] matrix of dataset to be clustered. It consists of n cases of d-dimensional data points. Every case has d attributes, variables or features.

ClusterNo

A number k which defines k different clusters to be built by the algorithm.

PlotIt

Default: FALSE, if TRUE plots the first three dimensions of the dataset with colored three-dimensional data points defined by the clustering stored in Cls

Alpha

No trimming is done equals to alpha =0, otherwise proportion of datapoints to be trimmed, tclust uses 0.05 as default.

...

Further arguments to be set for the clustering algorithm, e.g. ,nstart (number of random initializations),iter.max (maximum number of concentration steps),restr and restr.fact described in details. If not set, default arguments are used.

Details

"This iterative algorithm initializes k clusters randomly and performs "concentration steps" in order to improve the current cluster assignment. The number of maximum concentration steps to be performed is given by iter.max. For approximately obtaining the global optimum, the system is initialized nstart times and concentration steps are performed until convergence or iter.max is reached. When processing more complex data sets higher values of nstart and iter.max have to be specified (obviously implying extra computation time). ... The larger restr.fact is chosen, the looser is the restriction on the scatter matrices, allowing for more heterogeneity among the clusters. On the contrary, small values of restr.fact close to 1 imply very equally scattered clusters. This idea of constraining cluster scatters to avoid spurious solutions goes back to Hathaway (1985), who proposed it in mixture fitting problems" [Fritz et al., 2012]. The type of constraint restr can be set to "eigen", "deter" or "sigma.". Please see tclust for further parameter description.

Value

List of

Cls

[1:n] numerical vector with n numbers defining the classification as the main output of the clustering algorithm. It has k unique numbers representing the arbitrary labels of the clustering.

Object

Object defined by clustering algorithm as the other output of this algorithm

Author(s)

Michael Thrun

References

[Garcia-Escudero et al., 2008] Garcia-Escudero, L. A., Gordaliza, A., Matran, C., & Mayo-Iscar, A.: A general trimming approach to robust cluster analysis, The annals of Statistics, Vol. 36(3), pp. 1324-1345. 2008.

[Fritz et al., 2012] Fritz, H., Garcia-Escudero, L. A., & Mayo-Iscar, A.: tclust: An R package for a trimming approach to cluster analysis, Journal of statistical Software, Vol. 47(12), pp. 1-26. 2012.

Examples

data('Hepta')
out=RobustTrimmedClustering(Hepta$Data,ClusterNo=7,Alpha=0,PlotIt=FALSE)

Mthrun/FCPS documentation built on June 28, 2023, 9:29 a.m.