findClusterNumber | R Documentation |
A function designed to approximate the true number, K, of clusters in a dataset. The findClusterNumber function calls RandomSillyPutty for each value of K in the range from start to end, performing N random starts each time.
NOTE: start must be > 1. The function can be slow, depending on the complexity of the dataset and the number of iterations N.
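The overall shape of the search described above (loop over K, try N random starts for each, keep the best silhouette width) can be sketched as follows. This is only an illustration under stated assumptions: it scores plain random labelings with silhouette() from the cluster package and applies no SillyPutty refinement, so it is not the package's implementation. The helper maxRandomSW and the toy data are hypothetical.

```r
library(cluster)  # provides silhouette()

# Toy data: two well-separated groups, purely illustrative.
set.seed(1)
pts <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
             matrix(rnorm(40, mean = 6), ncol = 2))
d <- dist(pts)
n <- nrow(pts)

# Hypothetical helper: largest mean silhouette width over N random labelings.
# Assumption: random labels stand in for RandomSillyPutty's random starts;
# no refinement step is applied here.
maxRandomSW <- function(d, K, N, n) {
  best <- -Inf
  for (i in seq_len(N)) {
    labels <- sample(rep_len(seq_len(K), n))  # every cluster is non-empty
    sw <- mean(silhouette(labels, d)[, "sil_width"])
    best <- max(best, sw)
  }
  best
}

# One value per K in the range start..end (start must be > 1).
start <- 2; end <- 4; N <- 20
result <- lapply(start:end, function(K) maxRandomSW(d, K, N, n))
names(result) <- start:end
```

The run time grows with both the width of the K range and N, which is why the note above warns that the real function can be slow.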
findClusterNumber(distobj, start, end, N = 100,
                  method = c("SillyPutty", "HCSP"), ...)
distobj |
An object of class |
start |
The minimum cluster number for the range of clusters |
end |
The maximum cluster number for the range of clusters |
N |
Number of iterations |
method |
whether to use the full |
... |
Extra arguments to the |
The findClusterNumber function processes one distance matrix at a time, through N iterations. It returns a list of the maximum silhouette width values obtained from the N iterations, each with its associated cluster number.
A list containing the maximum silhouette width value for each K in the range of possible cluster numbers.
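One way to pick a single K from such a result is to take the K with the largest silhouette width. The following is a minimal sketch, assuming the return value behaves like a list whose names are the K values. The object sw and its numbers are made up for illustration and are not real output of findClusterNumber.

```r
# Hypothetical stand-in for the object returned by findClusterNumber();
# the values are illustrative only.
sw <- list("3" = 0.41, "4" = 0.57, "5" = 0.52, "6" = 0.38, "7" = 0.33)

# Flatten to a named numeric vector, then take the K with the largest width.
widths <- unlist(sw)
bestK <- as.integer(names(widths)[which.max(widths)])
bestK
```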
Kevin R. Coombes krc@silicovore.com, Dwayne G. Tally dtally110@hotmail.com
Pending.
data(eucdist)
set.seed(12)
y <- findClusterNumber(eucdist, start = 3, end = 7, method = "HCSP")
plot(names(y), y, xlab = "K", ylab = "Mean Silhouette Width",
type = "b", lwd = 2, pch = 16)