cluster.stats: From fpc package 2.2-9 : cluster.stats function

Description

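The cluster.stats function as taken from version 2.2-9 of the fpc package. Judging by its arguments, it computes distance-based statistics for validating a clustering and, optionally, for comparing it with an alternative clustering.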

Usage

cluster.stats(
  d = NULL,
  clustering,
  alt.clustering = NULL,
  noisecluster = FALSE,
  silhouette = TRUE,
  G2 = FALSE,
  G3 = FALSE,
  wgap = TRUE,
  sepindex = TRUE,
  sepprob = 0.1,
  sepwithnoise = TRUE,
  compareonly = FALSE,
  aggregateonly = FALSE
)

Arguments

d

a distance object (as generated by dist) or a distance matrix between cases.

clustering

an integer vector whose length equals the number of cases, indicating a clustering. The clusters must be numbered from 1 to the number of clusters.
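As a minimal sketch of preparing the first two arguments, the following base R code builds a distance object and renumbers arbitrary cluster labels to 1, ..., k. The data and the name raw_labels are made up for illustration.

```r
# Toy data: five points on a line, with arbitrary (non-consecutive) labels.
x <- c(0.0, 0.2, 5.0, 5.1, 9.7)
raw_labels <- c(10, 10, 42, 42, 7)   # hypothetical labels from an earlier step

# Distance object, as generated by dist().
d <- dist(x)

# Map the labels to 1..k in order of first appearance.
clustering <- match(raw_labels, unique(raw_labels))
```

The resulting clustering has one entry per case in d and uses only the labels 1 to 3.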

alt.clustering

an integer vector in the same format as clustering, indicating an alternative clustering. If provided, the corrected Rand index and Meila's VI for clustering vs. alt.clustering are computed.
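To make the two comparison measures concrete, here is a sketch in base R. It assumes that "corrected Rand index" means the usual Hubert-Arabie adjusted Rand index and that Meila's VI is the variation of information H(A) + H(B) - 2 I(A, B) with natural logarithms. The helper names adjusted_rand and variation_of_information are hypothetical and are not part of the package.

```r
# Adjusted (corrected) Rand index from the contingency table of two clusterings.
adjusted_rand <- function(a, b) {
  tab <- table(a, b)
  n <- sum(tab)
  sum_cells <- sum(choose(tab, 2))            # pairs together in both clusterings
  sum_rows  <- sum(choose(rowSums(tab), 2))   # pairs together in a
  sum_cols  <- sum(choose(colSums(tab), 2))   # pairs together in b
  expected  <- sum_rows * sum_cols / choose(n, 2)
  maximum   <- (sum_rows + sum_cols) / 2
  (sum_cells - expected) / (maximum - expected)
}

# Variation of information: H(a) + H(b) - 2 * I(a, b).
variation_of_information <- function(a, b) {
  p  <- table(a, b) / length(a)
  pa <- rowSums(p)
  pb <- colSums(p)
  entropy <- function(q) -sum(q[q > 0] * log(q[q > 0]))
  pos <- p > 0
  mutual <- sum(p[pos] * log(p[pos] / outer(pa, pb)[pos]))
  entropy(pa) + entropy(pb) - 2 * mutual
}

# Two clusterings that agree up to a relabelling of the clusters.
a <- c(1, 1, 2, 2)
b <- c(2, 2, 1, 1)
ari <- adjusted_rand(a, b)
vi  <- variation_of_information(a, b)
```

For clusterings that differ only in their labels, the adjusted Rand index is 1 and the variation of information is 0.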

noisecluster

logical. If TRUE, it is assumed that the largest cluster number in clustering denotes a 'noise class', i.e. points that do not belong to any cluster. These points are excluded from all statistics based on within-cluster and between-cluster distances, including the validation indexes.
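The effect of treating the largest cluster number as noise can be sketched in base R as follows. This illustrates the idea only and is not the function's internal code; the variable names are made up.

```r
x <- c(0, 1, 2, 10, 11, 50)
clustering <- c(1, 1, 1, 2, 2, 3)    # cluster 3 plays the role of the noise class

dm <- as.matrix(dist(x))

# Points carrying the largest cluster number are treated as noise ...
is_noise <- clustering == max(clustering)

# ... and dropped before any within/between-cluster distances are examined.
dm_core <- dm[!is_noise, !is_noise]
cl_core <- clustering[!is_noise]
```

Here the outlying point at 50 no longer contributes to any distance-based statistic.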

silhouette

logical. If TRUE, the silhouette statistics are computed; this requires the package cluster.
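For orientation, silhouette widths can be obtained directly from the cluster package as in the sketch below. Whether cluster.stats calls it in exactly this way is an assumption; the guard skips the computation if cluster is not installed.

```r
x <- c(0, 1, 2, 10, 11, 12)
clustering <- c(1L, 1L, 1L, 2L, 2L, 2L)
d <- dist(x)

avg_width <- NA_real_
if (requireNamespace("cluster", quietly = TRUE)) {
  # One row per case, with columns cluster, neighbor and sil_width.
  sil <- cluster::silhouette(clustering, d)
  avg_width <- mean(sil[, "sil_width"])
}
```

Average silhouette widths close to 1 indicate well-separated clusters.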

G2

logical. If TRUE, Goodman and Kruskal's index G2 (cf. Gordon (1999), p. 62) is computed. This involves a large amount of sorting and can be very slow (it has been improved by R. Francois - thanks!).
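The cost of G2 is easier to see from a brute-force sketch. It assumes G2 is the Goodman-Kruskal gamma applied to pairs made of one within-cluster and one between-cluster distance; Gordon (1999) should be consulted for the exact definition. The helper name goodman_kruskal_gamma is hypothetical.

```r
goodman_kruskal_gamma <- function(dm, clustering) {
  # All unordered pairs of cases and their distances.
  pairs <- which(upper.tri(dm), arr.ind = TRUE)
  dists <- dm[pairs]
  same  <- clustering[pairs[, 1]] == clustering[pairs[, 2]]
  within  <- dists[same]
  between <- dists[!same]
  # Every within-cluster distance is compared with every between-cluster one.
  concordant <- sum(outer(within, between, "<"))
  discordant <- sum(outer(within, between, ">"))
  (concordant - discordant) / (concordant + discordant)
}

dm <- as.matrix(dist(c(0, 1, 10, 11)))
g_good <- goodman_kruskal_gamma(dm, c(1, 1, 2, 2))  # clusters match the gaps
g_bad  <- goodman_kruskal_gamma(dm, c(1, 2, 1, 2))  # clusters cut across them
```

The number of comparisons grows with the product of the within-cluster and between-cluster pair counts, which is why the computation becomes slow for large data sets.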

G3

logical. If TRUE, the index G3 (cf. Gordon (1999), p. 62) is computed. This sorts all distances and can be extremely slow.

wgap

logical. If TRUE, the widest within-cluster gaps (largest link in within-cluster minimum spanning tree) are computed. This is used for finding a good number of clusters in Hennig (2013).
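The widest within-cluster gap can be sketched with Prim's algorithm in base R: grow a minimum spanning tree inside each cluster and record its longest edge. The helper name widest_gap and the convention of returning 0 for one-point clusters are assumptions of this sketch.

```r
# Largest edge of the minimum spanning tree of the points described by dm.
widest_gap <- function(dm) {
  n <- nrow(dm)
  if (n < 2) return(0)                 # assumption: no gap in a singleton
  in_tree <- c(TRUE, rep(FALSE, n - 1))
  best <- dm[1, ]                      # shortest link from the tree to each point
  widest <- 0
  for (step in seq_len(n - 1)) {
    best[in_tree] <- Inf               # never re-add points already in the tree
    nxt <- which.min(best)
    widest <- max(widest, best[nxt])
    in_tree[nxt] <- TRUE
    best <- pmin(best, dm[nxt, ])
  }
  unname(widest)
}

x <- c(0, 1, 3, 10, 20, 21)
clustering <- c(1, 1, 1, 1, 2, 2)
dm <- as.matrix(dist(x))

# One value per cluster, computed on that cluster's sub-matrix.
gaps <- sapply(split(seq_along(clustering), clustering),
               function(i) widest_gap(dm[i, i, drop = FALSE]))
```

In this toy example the first cluster has spanning-tree links 1, 2 and 7, so its widest gap is 7; the second cluster has a single link of length 1.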

sepindex

logical. If TRUE, a separation index is computed, based on the distance from every point to the closest point not in the same cluster. The separation index is the mean of the smallest proportion sepprob of these distances. This formalises separation in a way that is less sensitive to a single point or a few ambiguous points. The corresponding output component is sindex, not separation! This is used for finding a good number of clusters in Hennig (2013).
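The definition of the separation index translates into a few lines of base R. In this sketch the number of distances averaged is floor(n * sepprob), with at least one; that rounding rule is an assumption, as is the helper name sep_index.

```r
sep_index <- function(dm, clustering, sepprob = 0.1) {
  n <- nrow(dm)
  # Distance from each point to the closest point in a different cluster.
  nearest_other <- vapply(seq_len(n), function(i) {
    min(dm[i, clustering != clustering[i]])
  }, numeric(1))
  # Average the smallest proportion sepprob of these distances.
  k <- max(1, floor(n * sepprob))      # assumed rounding rule
  mean(sort(nearest_other)[seq_len(k)])
}

x <- c(0, 1, 2, 3, 4, 10, 11, 12, 13, 14)
clustering <- rep(1:2, each = 5)
dm <- as.matrix(dist(x))
s <- sep_index(dm, clustering, sepprob = 0.2)
```

With ten points and sepprob = 0.2, the two smallest nearest-other-cluster distances are averaged; both are 6 (from 4 to 10 and back), so the index is 6.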

sepprob

numeric value between 0 and 1; see sepindex.

sepwithnoise

logical. If TRUE, and sepindex and noisecluster are both TRUE, the noise points are treated as a cluster in the computation of the separation index (sepindex). They are also taken into account when computing the minimum cluster separation.

compareonly

logical. If TRUE, only the corrected Rand index and Meila's VI are computed and returned (this requires alt.clustering to be specified).

aggregateonly

logical. If TRUE (and compareonly is FALSE), only aggregated information is returned, without clusterwise information (this reduces the size of the output somewhat).
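A call combining several of the arguments above might look like the following sketch. It calls fpc::cluster.stats, on the assumption that the copy documented here has the same interface, and is skipped if fpc is not installed. The data are simulated for illustration.

```r
set.seed(1)
# Two simulated groups of 20 points each in two dimensions.
x <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
           matrix(rnorm(40, mean = 6), ncol = 2))
d <- dist(x)

# Two clusterings of the same cases; cutree() numbers clusters from 1 to k.
tree <- hclust(d, method = "average")
clustering <- cutree(tree, k = 2)
alt <- cutree(tree, k = 3)

res <- NULL
if (requireNamespace("fpc", quietly = TRUE)) {
  res <- fpc::cluster.stats(d, clustering, alt.clustering = alt,
                            G2 = FALSE, G3 = FALSE, aggregateonly = TRUE)
  print(names(res))    # components that were computed
  print(res$sindex)    # separation index (see sepindex)
}
```

Leaving G2 and G3 switched off keeps the call fast, and aggregateonly = TRUE keeps the returned list small.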

See Also

cluster.stats


leca-dev/RFate documentation built on Sept. 19, 2024, 6:09 a.m.