TDDclust | R Documentation |
This is the trimmed version of the clustering algorithm based on the L1 depth proposed by Rebecka Jornsten (2004). The algorithm partitions the observations into clusters and assigns to each point z in the data space its L1 depth value with respect to its cluster. A trimming procedure is incorporated to remove the most extreme individuals of each cluster (those with the lowest depth values), in line with trimowa.
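To make the notion of depth concrete, here is a minimal base R sketch of the L1 depth of a point z with respect to a set of observations, using the equal-weights form D(z) = 1 - max(0, ||e(z)|| - f(z)), where e(z) is the average unit vector pointing from z to the observations and f(z) is the fraction of observations equal to z. The helper name l1_depth is hypothetical and not part of the package; whether TDDclust computes depth in exactly this way is an assumption.

```r
# Hypothetical helper (not part of the package): L1 depth of a point z
# with respect to the rows of a numeric matrix X, with equal weights.
l1_depth <- function(z, X) {
  X <- as.matrix(X)
  diffs <- sweep(X, 2, z)              # x_i - z for every row i
  norms <- sqrt(rowSums(diffs^2))      # Euclidean distances to z
  ties <- norms == 0                   # observations that coincide with z
  # Average unit vector from z to the observations that differ from z
  e_bar <- colSums(diffs[!ties, , drop = FALSE] / norms[!ties]) / nrow(X)
  f <- mean(ties)                      # fraction of observations equal to z
  1 - max(0, sqrt(sum(e_bar^2)) - f)
}

# Four corners of a square: the centre is deep, a distant point is not.
X <- rbind(c(1, 1), c(-1, 1), c(-1, -1), c(1, -1))
depth_centre <- l1_depth(c(0, 0), X)
depth_far <- l1_depth(c(100, 100), X)
```

Values close to 1 indicate a central point and values close to 0 an outlying one, which is why trimming the lowest-depth observations removes the most extreme individuals of a cluster.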
TDDclust(data, numClust, lambda, Th, niter, T0, simAnn, alpha, data1, verbose = TRUE)
data |
Data frame. Each row corresponds to an observation, and each column corresponds to a variable. All variables must be numeric. |
numClust |
Number of clusters. |
lambda |
Tuning parameter that controls the influence the data depth has over the clustering, see Jornsten (2004). |
Th |
Threshold for observations to be relocated, usually set to 0. |
niter |
Number of random initializations (iterations). |
T0 |
Simulated annealing parameter. It is the current temperature in the simulated annealing procedure. |
simAnn |
Simulated annealing parameter. It is the decay rate, default 0.9. |
alpha |
Proportion of trimmed sample. |
data1 |
The same data frame as data, used to incorporate the trimmed observations into the rest of them for the next iteration. |
verbose |
A logical specifying whether to provide descriptive output about the running process. Default TRUE. |
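As a rough illustration of how a temperature (T0) and a decay rate (simAnn) usually interact in simulated annealing, the sketch below uses geometric cooling and a Metropolis-style acceptance rule. Both function names are hypothetical, and the exact update and acceptance rules used inside TDDclust are assumptions here, not taken from the package source.

```r
# Illustrative only: these helpers are hypothetical and not part of the package.

# Geometric cooling: the temperature is multiplied by simAnn at each step.
cool <- function(T0, simAnn, steps) T0 * simAnn^(0:(steps - 1))

# Metropolis-style rule: `worse_by` is how much worse the proposed
# relocation is than the current partition (<= 0 means not worse).
accept_prob <- function(worse_by, temp) {
  if (worse_by <= 0) return(1)   # improvements are always accepted
  if (temp <= 0) return(0)       # zero temperature: never accept a worse move
  exp(-worse_by / temp)
}

temps <- cool(T0 = 1, simAnn = 0.9, steps = 4)
```

Under this reading, a setting of T0 = 0 would make the search purely greedy, since a worse relocation would never be accepted.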
A list with the following elements:
NN: Cluster assignment, NN[1,] is the final partition.
cases: Anthropometric cases (the multivariate median cluster representatives).
DD: Depth values of the observations (only if there are trimmed observations).
Cost: Final value of the optimal partition.
discarded: Discarded (trimmed) observations.
klBest: Iteration in which the optimal partition was found.
This function is derived from the original functions developed by Rebecka Jornsten, which were freely available at http://www.stat.rutgers.edu/home/rebecka/DDcl/. However, that page is no longer available following a website redesign.
Jornsten, R. (2004). Clustering and classification based on the L1 data depth. Journal of Multivariate Analysis 90, 67–89.
Vinue, G., and Ibanez, M. V. (2014). Data depth and Biclustering applied to anthropometric data. Exploring their utility in apparel design. Technical report.
# In the interests of simplicity of the computation involved, only 15 points are selected:
dataTDDcl <- sampleSpanishSurvey[1:15, c(2, 3, 5)]
dataTDDcl_aux <- sampleSpanishSurvey[1:15, c(2, 3, 5)]
numClust <- 3 ; alpha <- 0.01 ; lambda <- 0.5 ; niter <- 2
Th <- 0 ; T0 <- 0 ; simAnn <- 0.9
# For reproducing results, seed for randomness:
# suppressWarnings(RNGversion("3.5.0"))
# set.seed(2014)
res_TDDcl <- TDDclust(dataTDDcl, numClust, lambda, Th, niter, T0, simAnn, alpha, dataTDDcl_aux, FALSE)
prototypes <- anthrCases(res_TDDcl)
table(res_TDDcl$NN[1, ])
res_TDDcl$Cost
res_TDDcl$klBest
trimmed <- trimmOutl(res_TDDcl)