compareClustering: Parallel Validation of Cluster Analyses

View source: R/compareClustering.R

compareClustering    R Documentation

Parallel Validation of Cluster Analyses

Description

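Parallel validation of cluster analyses: computes validation statistics for every combination of distance measure, clustering method and number of clusters.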

Usage

compareClustering(dataMatrix, maxClusters, distanceMeasures = c("euclidean",
  "manhattan"), clusteringMethods = c("ward.D2", "single", "complete",
  "average", "mcquitty", "diana", "kmeans"), sfParallel = TRUE, sfCpus = 2,
  ...)

Arguments

dataMatrix

a data matrix accepted by stats::dist()

maxClusters

the maximum number of clusters to evaluate

distanceMeasures

a character vector of the distance measures to use (currently, only "euclidean", "maximum", "manhattan", "canberra", "binary" or "minkowski" are allowed)

clusteringMethods

a character vector of clustering methods to use. Currently, the following are allowed:

  • "ward.D", "ward.D2", "single", "complete", "average", "mcquitty", "median", "centroid" (use stats::hclust)

  • "diana" (uses cluster::diana)

  • "kmeans" (uses cluster::pam, i.e. partitioning around medoids)
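
The method names above fall into three families, each handled by a different function. The following is only a sketch of how such a dispatch could look, not the package's actual code. The helper name clusterOnce is hypothetical, and the sketch assumes that the cluster package is installed.

```r
library(cluster)  # provides diana() and pam()

# Hypothetical helper: returns an integer vector of cluster memberships
# for one distance matrix d, one method name and one number of clusters k.
clusterOnce <- function(d, method, k) {
  hclustMethods <- c("ward.D", "ward.D2", "single", "complete",
                     "average", "mcquitty", "median", "centroid")
  if (method %in% hclustMethods) {
    # agglomerative hierarchical clustering, tree cut into k groups
    unname(stats::cutree(stats::hclust(d, method = method), k = k))
  } else if (method == "diana") {
    # divisive hierarchical clustering, tree cut into k groups
    unname(stats::cutree(stats::as.hclust(cluster::diana(d)), k = k))
  } else if (method == "kmeans") {
    # the documentation states that "kmeans" uses cluster::pam
    unname(cluster::pam(d, k = k, cluster.only = TRUE))
  } else {
    stop("unknown clustering method: ", method)
  }
}

# Example data: scaled numeric columns of the built-in iris data set
d <- stats::dist(scale(datasets::iris[, 1:4]), method = "euclidean")
memberships <- clusterOnce(d, "ward.D2", k = 3)
table(memberships)
```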

sfParallel

logical; should snowfall be used in parallel mode?

sfCpus

number of CPUs to use

...

passed to snowfall::sfInit()
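
As a rough illustration of the arguments, the sketch below first builds the grid of combinations implied by the default distanceMeasures and clusteringMethods, and then shows what a call might look like. It assumes that the function is exported by a package named ClusterTools (taken from the repository name) and that this package is installed; the call is skipped otherwise. The iris data and maxClusters = 5 are arbitrary choices for the example.

```r
# Grid of combinations implied by the default arguments, for maxClusters = 5
maxClusters <- 5
grid <- expand.grid(
  distance = c("euclidean", "manhattan"),
  method = c("ward.D2", "single", "complete", "average",
             "mcquitty", "diana", "kmeans"),
  nCluster = 2:maxClusters,
  stringsAsFactors = FALSE
)
nrow(grid)  # 2 distances * 7 methods * 4 values of k = 56

# Call sketch; only runs if the (assumed) ClusterTools package is installed.
if (requireNamespace("ClusterTools", quietly = TRUE)) {
  res <- ClusterTools::compareClustering(
    dataMatrix = scale(datasets::iris[, 1:4]),
    maxClusters = maxClusters,
    sfParallel = FALSE  # do not use snowfall's parallel mode
  )
  print(res)
}
```

Setting sfParallel = FALSE is convenient for a first try on a small data set; for larger inputs, sfParallel = TRUE with a suitable sfCpus is presumably the intended mode.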

Value

a tibble::tibble with one row per combination of distance measure, method and number of clusters k (from 2 to maxClusters), and the columns:

distance

the distance measure used

method

the clustering method used

nCluster

the number of clusters k

totalAvgSilWidth

overall average silhouette width (cluster::summary.silhouette$avg.width)

minClustAvgSilWidth

minimal average cluster silhouette width (cluster::summary.silhouette$clus.avg.widths)

minSilWidth

minimal silhouette width (cluster::summary.silhouette$si.summary$`Min.`)

pPosSilWidths

percentage of positive silhouette widths

minClustJacMean

minimal cluster bootstrap mean of Jaccard's index (fpc::clusterboot$bootmean)

pClustJacOver06

percentage of cluster bootstrap means of Jaccard's index above 0.6

separationIndex

separation index (fpc::cluster.stats$sindex)

avgDistWithin

average distance within clusters (fpc::cluster.stats$average.within)

withinVsBetween

ratio of the average within-cluster distance to the average between-cluster distance (fpc::cluster.stats$wb.ratio)
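
To make the columns more concrete, the sketch below computes the silhouette-based values and the fpc::cluster.stats values for a single clustering. It is an illustration under assumptions, not the package's implementation: the iris data, the "ward.D2" method and k = 3 are arbitrary, pPosSilWidths is assumed to be on a 0 to 100 scale, and the fpc part only runs if fpc is installed. The bootstrap columns based on fpc::clusterboot are left out here.

```r
library(cluster)  # provides silhouette()

x <- scale(datasets::iris[, 1:4])
d <- stats::dist(x, method = "euclidean")
cl <- stats::cutree(stats::hclust(d, method = "ward.D2"), k = 3)

# Silhouette widths of all observations and their summary
sil <- cluster::silhouette(cl, d)
silSum <- summary(sil)

silStats <- list(
  totalAvgSilWidth    = silSum$avg.width,
  minClustAvgSilWidth = min(silSum$clus.avg.widths),
  # equals the "Min." entry of the silhouette summary
  minSilWidth         = min(sil[, "sil_width"]),
  # assumption: expressed as a percentage between 0 and 100
  pPosSilWidths       = 100 * mean(sil[, "sil_width"] > 0)
)
str(silStats)

# Distance-based statistics; skipped if fpc is not installed
if (requireNamespace("fpc", quietly = TRUE)) {
  cs <- fpc::cluster.stats(d, cl)
  fpcStats <- list(
    separationIndex = cs$sindex,
    avgDistWithin   = cs$average.within,
    withinVsBetween = cs$wb.ratio
  )
  str(fpcStats)
}
```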


VZoche-Golob/ClusterTools documentation built on April 3, 2022, 6:52 a.m.