compareClusterings: Compare pairs of clusterings

Description

Compute the adjusted Rand index between all pairs of clusterings, where larger values indicate a greater similarity between clusterings.

Usage

compareClusterings(clusters, adjusted = TRUE)

Arguments

clusters

A list of factors or vectors where each entry corresponds to a clustering. All vectors should be of the same length. The list itself should usually be named with a suitable label for each clustering.

adjusted

Logical scalar indicating whether the adjusted Rand index should be returned. If FALSE, the unadjusted Rand index is returned instead.

Details

The aim of this function is to allow us to easily determine the relationships between clusterings. For example, we might use this to determine which parameter settings have the greatest effect in a sweep by clusterSweep. Alternatively, we could use this to obtain an “ordering” of clusterings for visualization, e.g., with clustree.

This function does not provide any insight into the relationships between individual clusters. A large Rand index only means that two clusterings are similar but does not specify the corresponding set of clusters across clusterings. For that task, we suggest using the linkClusters function instead.
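To make the point about label correspondence concrete, the following is a minimal base R sketch of the Rand index between two clusterings. The helper rand_index is hypothetical and is not part of bluster; it uses the standard contingency-table (Hubert-Arabie) definition of the adjusted index, which is assumed here rather than taken from the package's own code.

```r
# Hypothetical helper (not a bluster function): Rand index between two
# label vectors, computed from their contingency table.
rand_index <- function(x, y, adjusted = TRUE) {
    stopifnot(length(x) == length(y))
    tab <- table(x, y)
    pairs <- function(k) k * (k - 1) / 2  # number of unordered pairs among k items

    same.both <- sum(pairs(tab))          # pairs placed together in both clusterings
    same.x <- sum(pairs(rowSums(tab)))    # pairs placed together in x
    same.y <- sum(pairs(colSums(tab)))    # pairs placed together in y
    total <- pairs(length(x))             # all pairs of observations

    if (!adjusted) {
        # Agreements = together in both + apart in both.
        (total + 2 * same.both - same.x - same.y) / total
    } else {
        # Correct for the agreement expected by chance.
        expected <- same.x * same.y / total
        maximum <- (same.x + same.y) / 2
        (same.both - expected) / (maximum - expected)
    }
}

a <- c(1, 1, 2, 2, 3, 3)
b <- c("z", "z", "x", "x", "y", "y")  # same partition as 'a', different labels
d <- c(1, 1, 1, 2, 2, 2)              # a genuinely different partition

rand_index(a, b)                   # 1: the labels differ but the partitions match
rand_index(a, d)                   # well below 1
rand_index(a, d, adjusted = FALSE) # unadjusted value is larger
```

The index only sees which observations are grouped together, so it reports perfect agreement for a and b without saying that cluster 1 in a corresponds to cluster "z" in b.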

Value

A symmetric square matrix of pairwise (adjusted) Rand indices between all pairs of clusterings.
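The shape of the returned matrix can be illustrated with a small base R sketch. This is not the package's implementation: simple_rand is a hypothetical helper that computes the unadjusted Rand index, the toy list is made up for illustration, and setting the diagonal to 1 is an assumption based on each clustering agreeing perfectly with itself.

```r
# Hypothetical helper (not a bluster function): unadjusted Rand index,
# i.e., the fraction of observation pairs on which two clusterings agree.
simple_rand <- function(x, y) {
    same.x <- outer(x, x, "==")   # TRUE if a pair shares a cluster in x
    same.y <- outer(y, y, "==")   # TRUE if a pair shares a cluster in y
    keep <- upper.tri(same.x)     # count each unordered pair once
    mean(same.x[keep] == same.y[keep])
}

# Toy clusterings of six observations (illustrative only).
toy <- list(
    first = c(1, 1, 2, 2, 3, 3),
    second = c(1, 1, 1, 2, 2, 2),
    third = c("a", "a", "b", "b", "c", "c")
)

# Fill a square matrix named by clustering; each pair is computed once
# and mirrored across the diagonal.
n <- length(toy)
out <- matrix(1, n, n, dimnames = list(names(toy), names(toy)))
for (i in seq_len(n - 1)) {
    for (j in seq(i + 1, n)) {
        out[i, j] <- out[j, i] <- simple_rand(toy[[i]], toy[[j]])
    }
}
out
```

Naming the input list is what gives the matrix its row and column names, which is why a named list is recommended for clusters.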

Author(s)

Aaron Lun

See Also

linkClusters, which identifies relationships between individual clusters across clusterings.

pairwiseRand, for calculation of the pairwise Rand index.

Examples

library(bluster)

# k-means uses random starts, so fix the seed for reproducibility.
set.seed(1000)

clusters <- list(
    nngraph = clusterRows(iris[,1:4], NNGraphParam()),
    hclust = clusterRows(iris[,1:4], HclustParam(cut.dynamic=TRUE)),
    kmeans = clusterRows(iris[,1:4], KmeansParam(20))
)

aris <- compareClusterings(clusters)

# Visualizing the relationships between clusterings.
# Here, k-means is forced to be least similar to the others
# by requesting 20 clusters.
# graph_from_adjacency_matrix() replaces the deprecated graph.adjacency().
ari.as.graph <- igraph::graph_from_adjacency_matrix(aris, mode="undirected", weighted=TRUE)
plot(ari.as.graph)

# Obtain an ordering of clusterings, using the first eigenvector
# as a 1-dimensional summary of the matrix:
ev1 <- eigen(aris)$vectors[,1]
o <- order(ev1)
rownames(aris)[o]


LTLA/bluster documentation built on Sept. 8, 2024, 4:37 a.m.