View source: R/clustering.sc.dp.R
backtracking.sc.dp | R Documentation |
Creates a clustering with k clusters by using the backtrack data produced by findwithinss.sc.dp().
backtracking.sc.dp(x, k, backtrack)
x |
a multi-dimensional array containing input data to be clustered |
k |
the number of clusters |
backtrack |
the backtrack data |
If the number of clusters is unknown, findwithinss.sc.dp() followed by backtracking.sc.dp() can be used to perform the clustering. Under the constraint that only subsequent (consecutive) elements of the input data may form a cluster, findwithinss.sc.dp() calculates the exact minimum of the sum of squares of within-cluster distances (withinss) from each element to its corresponding cluster centre (mean) for different numbers of clusters. The user may analyse the withinss values to select a suitable number of clusters; in this case, it is enough to run backtracking.sc.dp() only once. Another option is to run findwithinss.sc.dp() once, repeat the backtracking.sc.dp() step for a range of candidate cluster numbers, and then evaluate the optimal solutions obtained for each of them. This requires much less time than repeating the whole clustering algorithm for each number of clusters.
An object of class 'clustering.sc.dp', which has a print method and is a list with components:
cluster |
A vector of integers indicating the cluster to which each point is allocated. |
centers |
A matrix whose rows represent cluster centres. |
withinss |
The within-cluster sum of squares for each cluster. |
size |
The number of points in each cluster. |
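To show how these components relate to one another, the sketch below recomputes size, centers and withinss in base R from a cluster vector. It does not call the package: the cluster vector is a hypothetical assignment made up for illustration, and the code assumes that centres are cluster means and that withinss is the sum of squared distances to the centre, as described above.

```r
set.seed(42)
# 20 points of a two-dimensional random walk (one point per row)
x <- apply(matrix(rnorm(40, 0, 0.1), ncol = 2), 2, cumsum)

# Hypothetical assignment: four clusters of consecutive points
cluster <- rep(1:4, times = c(5, 3, 7, 5))

# Number of points in each cluster
size <- as.vector(table(cluster))

# Cluster centres: per-cluster column sums divided by the cluster sizes
centers <- rowsum(x, cluster) / size

# Squared deviation of every point from the centre of its own cluster,
# summed over both coordinates and over the points of each cluster
sq_dev <- (x - centers[cluster, , drop = FALSE])^2
withinss <- as.vector(rowSums(rowsum(sq_dev, cluster)))

print(size)
print(centers)
print(withinss)
```

Summing withinss over all clusters gives the total that is minimised when choosing the cluster boundaries.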
Tibor Szkaliczki szkaliczki.tibor@sztaki.hu
findwithinss.sc.dp, clustering.sc.dp
# Example: clustering data generated from a random walk with small withinss
x <- matrix(, nrow = 100, ncol = 2)
x[1, ] <- c(0, 0)
for (i in 2:100) {
  x[i, 1] <- x[i - 1, 1] + rnorm(1, 0, 0.1)
  x[i, 2] <- x[i - 1, 2] + rnorm(1, 0, 0.1)
}
k <- 10
r <- findwithinss.sc.dp(x, k)

# select the first cluster number where withinss drops below a threshold
thres <- 5.0
k_th <- 1
while (r$twithinss[k_th] > thres & k_th < k) {
  k_th <- k_th + 1
}

# backtrack
result <- backtracking.sc.dp(x, k_th, r$backtrack)
plot(x, type = 'b', col = result$cluster)
points(result$centers, pch = 24, bg = (1:k_th))