ModelExplorer: Stability measure for finding the number of clusters



Description

ModelExplorer measures the stability of a given clustering method on subsamples of the original dataset and plots the cumulative distribution of the stability measure for each candidate number of clusters. The more the distribution is concentrated on the right, that is, at high values of the similarity measure, the more stable the clustering.
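The idea of subsample-based stability can be sketched in base R as follows. This is a minimal illustration in the spirit of Ben-Hur et al. (2002), not the package's actual implementation: the names stability_sketch, rand_index and kmeans_labels are hypothetical, the plain Rand index stands in for adj.rand.index, and stats::kmeans stands in for myKmean. It assumes maxK is at least 2.

```r
# Hypothetical helper: plain Rand index, the share of observation pairs
# on which two labelings agree (same cluster in both, or different in both).
rand_index <- function(a, b) {
  same_a <- outer(a, a, "==")
  same_b <- outer(b, b, "==")
  upper <- upper.tri(same_a)
  mean(same_a[upper] == same_b[upper])
}

# Hypothetical helper: k-means labels for a data matrix.
kmeans_labels <- function(X, k) {
  stats::kmeans(X, centers = k, nstart = 5)$cluster
}

# Assumed procedure: for each k and each iteration, cluster two random
# subsamples of relative size rho and compare the labels of the
# observations that appear in both subsamples.
stability_sketch <- function(X, maxK, rho = 0.8, B = 20) {
  X <- as.matrix(X)
  n <- nrow(X)
  m <- floor(rho * n)
  out <- matrix(NA_real_, nrow = maxK - 1, ncol = B,
                dimnames = list(paste0("k=", 2:maxK), NULL))
  for (k in 2:maxK) {
    for (b in seq_len(B)) {
      i1 <- sample.int(n, m)
      i2 <- sample.int(n, m)
      c1 <- kmeans_labels(X[i1, , drop = FALSE], k)
      c2 <- kmeans_labels(X[i2, , drop = FALSE], k)
      common <- intersect(i1, i2)
      out[k - 1, b] <- rand_index(c1[match(common, i1)],
                                  c2[match(common, i2)])
    }
  }
  out
}

set.seed(1)
S <- stability_sketch(iris[, 1:4], maxK = 4, B = 10)
dim(S)
```

Each row of S holds the B similarity values for one candidate number of clusters, which mirrors the shape of the matrix described under Value.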

Usage

ModelExplorer(X, maxK, similarity = adj.rand.index,
  clusterAlg = myKmean, rho = 0.8, B = 100, verbose = FALSE, ...)

Arguments

X

data matrix or data frame of size n x d, with n observations and d features.

maxK

maximum number of clusters to evaluate.

similarity

function measuring the similarity between two partitions.

clusterAlg

clustering algorithm. Its output must be a list with a component "cluster" containing the cluster assignment of each observation. For more details, see how the function myKmean formats its output.

rho

numeric between 0 and 1. Size of each subsample as a proportion of the original dataset.

B

Number of resampling iterations.

verbose

logical. If TRUE, plots the progress of the algorithm.

...

additional parameters for the clustering algorithm.
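The similarity and clusterAlg arguments accept user-supplied functions. The sketch below shows what such functions might look like. The names jaccard_pairs and myHclust are hypothetical, and the signatures (two label vectors for the similarity function, a data set plus a number of clusters k for the clustering function) are assumptions, since this page does not spell them out; check adj.rand.index and myKmean in the package source for the exact interface.

```r
# Hypothetical similarity function: Jaccard coefficient over pairs of
# observations. A pair counts when both observations share a cluster.
jaccard_pairs <- function(part1, part2) {
  stopifnot(length(part1) == length(part2))
  co1 <- outer(part1, part1, "==")
  co2 <- outer(part2, part2, "==")
  upper <- upper.tri(co1)
  both <- sum(co1[upper] & co2[upper])
  either <- sum(co1[upper] | co2[upper])
  if (either == 0) return(1)  # no co-clustered pair in either partition
  both / either
}

# Hypothetical clustering wrapper: hierarchical clustering cut into k
# groups, returned as a list with a "cluster" component as the
# clusterAlg argument requires. Extra arguments go to stats::hclust.
myHclust <- function(X, k, ...) {
  tree <- stats::hclust(stats::dist(X), ...)
  list(cluster = stats::cutree(tree, k = k))
}

# Partitions that are identical up to relabeling
jaccard_pairs(c(1, 1, 2, 2), c(2, 2, 1, 1))

res <- myHclust(iris[, 1:4], k = 3, method = "average")
table(res$cluster)
```

If the assumed signatures match the package's, these could be passed as similarity = jaccard_pairs and clusterAlg = myHclust, with method = "average" forwarded through the ... argument.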

Value

Matrix of size (maxK-1) x B containing the stability measure for each iteration.
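A matrix of this shape can be summarised with empirical cumulative distribution functions, one per row. The sketch below uses placeholder random values rather than a real ModelExplorer result, and it assumes that the rows correspond to k = 2, ..., maxK, which this page does not state explicitly.

```r
set.seed(42)
# Placeholder values only, standing in for a returned stability matrix
# with maxK = 4 and B = 50. The row labels are an assumption.
S <- rbind(runif(50, 0.9, 1.0),
           runif(50, 0.5, 1.0),
           runif(50, 0.3, 0.9))
rownames(S) <- paste0("k=", 2:4)

# One empirical CDF per candidate number of clusters
cdfs <- lapply(seq_len(nrow(S)), function(i) stats::ecdf(S[i, ]))

# Overlay the curves; a curve that stays low until the far right means
# most similarity values are high, i.e. a stable clustering.
plot(cdfs[[1]], xlim = c(0, 1), do.points = FALSE, verticals = TRUE,
     main = "Cumulative distribution of similarity", xlab = "similarity")
for (i in seq_along(cdfs)[-1]) {
  plot(cdfs[[i]], add = TRUE, col = i, do.points = FALSE, verticals = TRUE)
}
legend("topleft", legend = rownames(S), col = seq_along(cdfs), lty = 1)

# Numeric summary: share of iterations with similarity above 0.9
share_high <- sapply(cdfs, function(f) 1 - f(0.9))
names(share_high) <- rownames(S)
share_high
```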

References

Ben-Hur, A., Elisseeff, A., and Guyon, I. (2002). A stability based method for discovering structure in clustered data. Pacific Symposium on Biocomputing, 2002:6-17.


mattmail/clusterAnalysis documentation built on Nov. 4, 2019, 6:18 p.m.