Gap: Gap statistics
In mattmail/clusterAnalysis: Several tools for determining the number of clusters in a dataset


View source: R/gap.R

Tibshirani's gap statistic for determining the number of clusters. It computes the within-cluster dispersion of a partition and compares it with the within-cluster dispersion of generated reference datasets that have similar statistics to the original. The within-cluster dispersion is the sum, over clusters, of the normalized sum of the distances between each pair of observations in the cluster.
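The dispersion described above can be sketched in a few lines of R. This is a minimal illustration, not the package's code: it assumes Euclidean distances and the normalization used in Tibshirani et al. (2001), where each cluster's sum of pairwise distances is divided by twice the cluster size. The name `within_dispersion` is hypothetical.

```r
# Within-cluster dispersion W_k, following Tibshirani et al. (2001):
#   W_k = sum over clusters r of D_r / (2 * n_r),
# where D_r sums the distances over all ordered pairs in cluster r.
# Assumption: Euclidean distance; the package may use another metric.
within_dispersion <- function(X, cluster) {
  X <- as.matrix(X)
  total <- 0
  for (r in unique(cluster)) {
    members <- X[cluster == r, , drop = FALSE]
    n_r <- nrow(members)
    # dist() returns each unordered pair once, so double it for ordered pairs
    D_r <- 2 * sum(dist(members))
    total <- total + D_r / (2 * n_r)
  }
  total
}

# Toy data: two tight pairs of points on a line
X <- matrix(c(0, 1, 10, 11), ncol = 1)
w_two <- within_dispersion(X, c(1, 1, 2, 2))  # partition into the two pairs
w_one <- within_dispersion(X, c(1, 1, 1, 1))  # everything in one cluster
```

On this toy data the two-cluster partition has a much smaller dispersion than the single cluster, which is the drop the gap statistic compares against reference datasets.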

Gap(X, maxK, clusterAlg = myKmean, B = 50, null_distrib = "gaussian", verbose = TRUE, ...)

`X`	data matrix or data frame of size n x d, n observations and d features
`maxK`	maximum number of clusters to evaluate.
`clusterAlg`	clustering algorithm. Its output must be a list with a component "cluster" containing the cluster assignment of each observation. For more details, check the formatting of function `myKmean`.
`B`	number of reference datasets to generate
`null_distrib`	type of null distribution used to generate the reference datasets. One of "gaussian", "uniform" or "uniformity". "gaussian" draws observations from a multidimensional normal distribution with the same mean and variance as the original dataset for each feature. "uniform" draws observations uniformly over the range of each feature. "uniformity" draws observations from a uniform distribution as in the gap statistic (Tibshirani et al. 2001).
`verbose`	logical, if TRUE, plots the evolution of the algorithm
`...`	additional parameters for the clustering algorithm
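As an illustration of the `clusterAlg` contract, the sketch below wraps hierarchical clustering so that it returns a list with a "cluster" component. The wrapper name `myHclust` is hypothetical, and the calling convention `clusterAlg(X, k, ...)` is an assumption; the source only states the required output format, so check `myKmean` for the actual signature.

```r
# Hypothetical wrapper: average-linkage hierarchical clustering cut into k groups.
# Assumption: Gap() calls clusterAlg(X, k, ...); see myKmean for the real convention.
myHclust <- function(X, k, ...) {
  tree <- hclust(dist(X), method = "average")
  # Return a list whose "cluster" component holds one label per observation
  list(cluster = cutree(tree, k = k))
}

# Toy data: two well-separated groups of 20 points in 2 dimensions
set.seed(1)
X <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
           matrix(rnorm(40, mean = 8), ncol = 2))

res <- myHclust(X, k = 2)
table(res$cluster)

# With the package installed, the wrapper would then be passed as
# Gap(X, maxK = 6, clusterAlg = myHclust)
```

Any algorithm can be plugged in this way as long as the returned list exposes the assignments under the name "cluster".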

list of 3 components

kopt: optimal number of clusters
gap: vector of values for the gap statistic
s: empirical standard deviation of the gap statistic
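To show how `gap` and `s` relate to an optimal k, here is a sketch of the selection rule from Tibshirani et al. (2001): choose the smallest k such that Gap(k) >= Gap(k+1) - s(k+1). Whether `Gap` computes `kopt` with exactly this rule is an assumption, as is the convention that element k of each vector corresponds to k clusters. The function name `select_k` and the numeric values are illustrative only.

```r
# Smallest k with gap[k] >= gap[k + 1] - s[k + 1] (Tibshirani et al. 2001).
# Assumption: gap[k] and s[k] correspond to k clusters, k = 1..maxK.
select_k <- function(gap, s) {
  K <- length(gap)
  if (K < 2) return(K)
  ok <- gap[-K] >= gap[-1] - s[-1]
  # Fall back to the largest k evaluated if the criterion is never met
  if (any(ok)) which(ok)[1] else K
}

# Made-up values for illustration
gap <- c(0.10, 0.45, 0.50, 0.48)
s   <- c(0.03, 0.04, 0.06, 0.05)
k_rule <- select_k(gap, s)
k_max  <- which.max(gap)
```

With these values the rule picks k = 2 even though the raw maximum of `gap` is at k = 3, because the increase from 2 to 3 is within one standard deviation.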

Tibshirani, R., Walther, G., and Hastie, T. (2001). Estimating the number of clusters in a data set via the gap statistic. Journal of the Royal Statistical Society Series B, 63:411-423.

mattmail/clusterAnalysis documentation built on Nov. 4, 2019, 6:18 p.m.
