optimal_phyloregion: Determine optimal number of clusters
optimal_phyloregion

R Documentation

Determine optimal number of clusters

Description

This function divides a hierarchical dendrogram into meaningful clusters ("phyloregions") by locating the ‘elbow’ or ‘knee’ of an evaluation graph, that is, the point of optimal curvature on the curve of explained variance against number of clusters.
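To make the ‘knee’ idea concrete, here is a minimal base R sketch of one simple heuristic: pick the point on the curve that lies farthest from the straight line joining its first and last points. This is an illustration only and is not claimed to be the procedure used by optimal_phyloregion or by the GMD package; the helper `find_knee` and the toy `ev` values are hypothetical, and `k` is assumed to be sorted in increasing order.

```r
# Hypothetical helper (not part of phyloregion): return the k whose point on
# the (k, ev) curve is farthest from the chord joining the curve's end points.
find_knee <- function(k, ev) {
  # Rescale both axes to [0, 1] so that neither dominates the distance
  kx <- (k - min(k)) / (max(k) - min(k))
  ey <- (ev - min(ev)) / (max(ev) - min(ev))
  n <- length(kx)
  # Perpendicular distance of every point from the chord
  dx <- kx[n] - kx[1]
  dy <- ey[n] - ey[1]
  dist <- abs(dy * (kx - kx[1]) - dx * (ey - ey[1])) / sqrt(dx^2 + dy^2)
  k[which.max(dist)]
}

# Made-up explained variances that rise steeply and then flatten out
k  <- 1:10
ev <- c(0.10, 0.45, 0.70, 0.80, 0.84, 0.87, 0.89, 0.90, 0.91, 0.92)
knee <- find_knee(k, ev)
print(knee)
```

Rescaling both axes first keeps the result independent of the units of `k` and `ev`; other knee detectors, such as the L method of Salvador & Chan (2004), fit two line segments instead.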

Usage

optimal_phyloregion(x, method = "average", k = 20)

Arguments

`x`	a numeric matrix, data frame or “dist” object.
`method`	the agglomeration method to be used. This should be (an unambiguous abbreviation of) one of “ward.D”, “ward.D2”, “single”, “complete”, “average” (= UPGMA), “mcquitty” (= WPGMA), “median” (= WPGMC) or “centroid” (= UPGMC).
`k`	numeric, the upper bound on the number of clusters to evaluate. Defaults to 20, or to the number of observations if that is less than 20.

Value

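a list containing the following, as returned from the GMD package (Zhao et al. 2011):

k: optimal number of clusters (bioregions)
totbss: total between-cluster sum of squares
tss: total sum of squares of the data
ev: explained variance given k

In the example below, k and ev are read from two elements of the returned list: df, which holds the values for each candidate number of clusters, and optimal, which holds the values for the selected solution.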
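The quantities above can be illustrated with a short base R sketch that cuts an hclust tree at several values of k and computes the explained variance as totbss / tss. This is a sketch under stated assumptions, not the package's or GMD's implementation: it assumes Euclidean distances (so that sums of squares can be recovered from pairwise distances), uses simulated toy data, and the helpers `ss_from_dist` and `ev_for_k` are hypothetical names.

```r
set.seed(1)
# Toy data (assumption): three well-separated groups of 10 points in 2D
xy <- rbind(matrix(rnorm(20, mean = 0,  sd = 0.3), ncol = 2),
            matrix(rnorm(20, mean = 5,  sd = 0.3), ncol = 2),
            matrix(rnorm(20, mean = 10, sd = 0.3), ncol = 2))
d  <- dist(xy)                       # Euclidean "dist" object
hc <- hclust(d, method = "average")  # same default linkage as in Usage
dm <- as.matrix(d)

# Sum of squares of a set of points from their pairwise distances:
# squared distances summed over unordered pairs, divided by the set size.
# The full matrix counts every pair twice, hence the factor 2.
ss_from_dist <- function(dm, idx) {
  sub <- dm[idx, idx, drop = FALSE]
  sum(sub^2) / (2 * length(idx))
}

tss <- ss_from_dist(dm, seq_len(nrow(dm)))  # total sum of squares

# Explained variance for a cut of the tree into k clusters
ev_for_k <- function(k) {
  grp <- cutree(hc, k = k)
  wss <- sum(vapply(split(seq_along(grp), grp),
                    function(idx) ss_from_dist(dm, idx), numeric(1)))
  (tss - wss) / tss  # totbss / tss
}

ks <- 1:8
ev <- vapply(ks, ev_for_k, numeric(1))
print(round(ev, 3))
```

With data like these, ev rises sharply up to the true number of groups and then levels off, which is the shape on which an elbow or knee criterion relies.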

References

Salvador, S. & Chan, P. (2004) Determining the number of clusters/segments in hierarchical clustering/segmentation algorithms. Proceedings of the Sixteenth IEEE International Conference on Tools with Artificial Intelligence, pp. 576–584. Institute of Electrical and Electronics Engineers, Piscataway, New Jersey, USA.

Zhao, X., Valen, E., Parker, B.J. & Sandelin, A. (2011) Systematic clustering of transcription start site landscapes. PLoS ONE 6: e23409.

Examples

library(phyloregion)

data(africa)
tree <- africa$phylo
# Beta diversity between communities; the first element is passed on
bc <- beta_diss(africa$comm)
(d <- optimal_phyloregion(bc[[1]], k = 15))

# Explained variance against number of clusters
plot(d$df$k, d$df$ev, ylab = "Explained variance",
  xlab = "Number of clusters")
lines(d$df$k[order(d$df$k)], d$df$ev[order(d$df$k)])
# Highlight the optimal number of clusters
points(d$optimal$k, d$optimal$ev, pch = 21, bg = "red", cex = 3)
points(d$optimal$k, d$optimal$ev, pch = 21, bg = "red", type = "h")

darunabas/phyloregion documentation built on Oct. 27, 2024, 10:01 p.m.