View source: R/cluster_stability.R
Computes the cluster stability for different values of k via the non-parametric bootstrap.
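The bootstrap idea can be illustrated with a rough base R sketch. This is not the package's implementation: the helper name instability_sketch is hypothetical, and comparing the two clusterings only on the objects that appear in both bootstrap samples is a simplifying assumption made for this illustration. It uses only hclust() and cutree() from base R.

```r
# Hypothetical helper (not part of the package): rough bootstrap instability
# for a single k, given a p x p distance matrix d.
instability_sketch <- function(d, k, B = 25) {
  p <- nrow(d)
  disagreement <- numeric(B)
  for (i in seq_len(B)) {
    # draw two independent bootstrap samples of object indices
    idx1 <- sample.int(p, p, replace = TRUE)
    idx2 <- sample.int(p, p, replace = TRUE)
    # cluster each bootstrap sample separately and cut the tree into k clusters
    labels1 <- cutree(hclust(as.dist(d[idx1, idx1])), k = k)
    labels2 <- cutree(hclust(as.dist(d[idx2, idx2])), k = k)
    # assumption: compare the clusterings on objects present in both samples
    shared <- intersect(idx1, idx2)
    a <- labels1[match(shared, idx1)]
    b <- labels2[match(shared, idx2)]
    same_a <- outer(a, a, "==")
    same_b <- outer(b, b, "==")
    # share of object pairs grouped together in one clustering but not the other
    disagreement[i] <- mean(same_a[upper.tri(same_a)] != same_b[upper.tri(same_b)])
  }
  mean(disagreement)
}
```

Averaging the pairwise disagreement over B pairs of samples gives one instability value per k; repeating this over a range of k yields a curve like the one plotted in the example at the end of this page.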
cluster_stability(dist, kseq, B, norm = FALSE, ...)
dist: p x p distance matrix, where p is the number of objects.
kseq: A sequence of values of k (numbers of clusters) for which cluster stability should be computed.
B: Number of bootstrap samples.
norm: Logical. The default, FALSE, corresponds to the method of Fang & Wang (2012).
...: Additional arguments passed to hclust.
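As a small illustration of the expected inputs, the following sketch builds a p x p distance matrix and a k sequence in base R and checks the matrix shape. The data and variable names are made up for this example.

```r
set.seed(1)
x <- matrix(rnorm(40), nrow = 20, ncol = 2)  # 20 objects, 2 features (made-up data)
d <- as.matrix(dist(x))                      # 20 x 20 Euclidean distance matrix
kseq <- 2:5                                  # candidate numbers of clusters

# sanity checks: square, symmetric, zero diagonal
stopifnot(nrow(d) == ncol(d), isSymmetric(d), all(diag(d) == 0))
```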
The function returns a vector of cluster instability indices, one for each k in kseq.
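Since lower instability indicates a more stable clustering, one way to pick k from the returned vector is to take the k with the smallest index. The values below are invented purely for illustration and are not output of the function.

```r
# Made-up instability values for illustration only (not real output)
kseq <- 2:6
instab <- c(0.08, 0.01, 0.05, 0.07, 0.09)

# k with the lowest instability
best_k <- kseq[which.min(instab)]
best_k
```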
Jonas Haslbeck <jonashaslbeck@gmail.com>
Fang, Y., & Wang, J. (2012). Selection of the number of clusters via the bootstrap method. Computational Statistics & Data Analysis, 56(3), 468-477.
## Not run:
set.seed(1) # make data and bootstrap reproducible
# simple Gaussian mixture
data <- c(rnorm(100, 0, 1),
          rnorm(100, 5, 1),
          rnorm(100, 10, 1))
hist(data, breaks = 40) # look at mixture
# compute distance matrix
dist <- as.matrix(dist(data))
kseq <- 2:10 # define k sequence of interest
instobj <- cluster_stability(dist, kseq, B = 25)
# visualize instability as a function of k:
plot(kseq, instobj, ylim = c(0, .15), type = 'l',
     xlab = 'k', ylab = 'Cluster Instability')
# correctly identifies k=3!
## End(Not run)