comembership_table: Calculates the 2x2 contingency table of agreements and...
In clusteval: Evaluation of Clustering Algorithms


For two clusterings of the same data set, this function calculates the 2x2 contingency table of agreements and disagreements between the two corresponding comembership vectors. Two observations are comembers in a clustering if they are assigned to the same cluster; the comembership vector records this indicator for every pair of observations.

comembership_table(labels1, labels2)

`labels1`	a vector of `n` clustering labels
`labels2`	a vector of `n` clustering labels

The contingency table is typically used to compute a similarity statistic (e.g., Rand index, Jaccard index) between the two clusterings. The 2x2 contingency table consists of the following four cells:

n_11: the number of observation pairs where the observations are comembers in both clusterings
n_10: the number of observation pairs where the observations are comembers in the first clustering but not the second
n_01: the number of observation pairs where the observations are comembers in the second clustering but not the first
n_00: the number of observation pairs where the observations are comembers in neither clustering
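
To make the four cells concrete, here is a minimal base-R sketch that counts them by enumerating every pair of observations. The helper name `comembership_table_sketch` is hypothetical and is not part of clusteval; the package's own function is implemented with Rcpp and may differ in details such as input validation.

```r
# Hypothetical reference helper (not the clusteval implementation):
# counts the four cells by comparing labels over all 'n choose 2' pairs.
comembership_table_sketch <- function(labels1, labels2) {
  stopifnot(length(labels1) == length(labels2))

  # Each column of 'pairs' holds the indices (i, j) of one observation pair.
  pairs <- combn(length(labels1), 2)

  # TRUE when the two observations share a cluster in the given clustering.
  same1 <- labels1[pairs[1, ]] == labels1[pairs[2, ]]
  same2 <- labels2[pairs[1, ]] == labels2[pairs[2, ]]

  list(
    n_11 = sum(same1 & same2),    # comembers in both clusterings
    n_10 = sum(same1 & !same2),   # comembers in the first only
    n_01 = sum(!same1 & same2),   # comembers in the second only
    n_00 = sum(!same1 & !same2)   # comembers in neither
  )
}

# Four observations give choose(4, 2) = 6 pairs.
tab <- comembership_table_sketch(c(1, 1, 2, 2), c(1, 1, 1, 2))
str(tab)
```

Enumerating pairs with `combn` takes memory proportional to the number of pairs, so this sketch is only suitable for small `n`.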

Tibshirani and Walther (2005) use the term 'co-membership', which we shorten to 'comembership'. Some authors instead use the terms 'connectivity' or 'co-occurrence'.

We use the Rcpp package to improve the runtime speed of this function.

A named list containing the four cells of the calculated contingency table:

n_11
n_10
n_01
n_00
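
As a sketch of how the returned list feeds into a similarity statistic, the snippet below computes the Rand and Jaccard indices from the four cells using their standard definitions. The counts are copied by hand from the first example output on this page rather than produced by calling the package, and the variable names are illustrative only.

```r
# Counts copied from the first example output on this page (n = 10).
tab <- list(n_11 = 3, n_10 = 11, n_01 = 10, n_00 = 21)

# The four cells partition all 'n choose 2' pairs.
n_pairs <- tab$n_11 + tab$n_10 + tab$n_01 + tab$n_00

# Rand index: proportion of pairs on which the two clusterings agree.
rand_index <- (tab$n_11 + tab$n_00) / n_pairs

# Jaccard index: like Rand, but ignores pairs separated in both clusterings.
jaccard_index <- tab$n_11 / (tab$n_11 + tab$n_10 + tab$n_01)

print(c(rand = rand_index, jaccard = jaccard_index))
```

Because n_00 usually dominates when there are many clusters, the Jaccard index is often the more conservative of the two.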

Tibshirani, R. and Walther, G. (2005). Cluster Validation by Prediction Strength. Journal of Computational and Graphical Statistics, 14(3), 511-528. http://amstat.tandfonline.com/doi/abs/10.1198/106186005X59243

# We randomly assign one of K = 3 labels to each of n = 10 observations,
# once per clustering, and compute the comembership table over all
# 'n choose 2' pairs.
set.seed(42)
K <- 3
n <- 10
labels1 <- sample.int(K, n, replace = TRUE)
labels2 <- sample.int(K, n, replace = TRUE)
comembership_table(labels1, labels2)

# Here, we cluster the iris data set with the K-means and hierarchical
# algorithms using the true number of clusters, K = 3.
# Then, we compute the 2x2 contingency table of agreements and disagreements
# of the comemberships.
iris_kmeans <- kmeans(iris[, -5], centers = 3)$cluster
iris_hclust <- cutree(hclust(dist(iris[, -5])), k = 3)
comembership_table(iris_kmeans, iris_hclust)

$n_11
[1] 3

$n_10
[1] 11

$n_01
[1] 10

$n_00
[1] 21

$n_11
[1] 2867

$n_10
[1] 952

$n_01
[1] 1292

$n_00
[1] 6064

clusteval documentation built on May 2, 2019, 9:18 a.m.
