randIndex | R Documentation |
Calculates Rand type Indices to compare two partitions
randIndex(c1, c2 = NULL, noisecluster = NULL)
c1 |
labels of the first partition or contingency table. A numeric vector or factor containing the class labels of the first partition, or a 2-dimensional numeric matrix which contains the cross-tabulation of cluster assignments. |
c2 |
labels of the second partition. A numeric vector or a factor containing the class labels of the second partition. The length of the vector must equal that of c1. |
noisecluster |
label or number associated with the 'noise class' or 'noise level'. Number or character label which denotes the points which do not belong to any cluster. These points are not taken into account for the computation of the Rand type indexes. The default is to consider all points. |
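The two input forms described above can be illustrated in base R. This is only a sketch: the label vectors are made up, and dropping the points labelled as noise in either partition before tabulating is an assumption about what excluding the noise class amounts to, not the package's internal code.

```r
# Hypothetical label vectors; 0 marks points that belong to no cluster
c1 <- c(1, 1, 2, 2, 2, 2, 3, 3, 3, 3)
c2 <- c(1, 2, 1, 2, 2, 3, 3, 3, 0, 3)
noise <- 0

# Keep only points not labelled as noise in either partition (assumed behaviour)
keep <- c1 != noise & c2 != noise

# Cross-tabulation of cluster assignments: rows follow c1, columns follow c2
tab <- table(c1[keep], c2[keep])
print(tab)
```

The resulting 3 x 3 table counts 9 points, since one point was labelled as noise.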
A list with Rand type indexes:
AR Adjusted Rand index. A number between -1 and 1. The adjusted Rand index is the corrected-for-chance version of the Rand index.
RI Rand index (unadjusted). A number between 0 and 1. The Rand index is the fraction of pairs of objects for which the two classification methods agree. RI ranges from 0 (no pair classified in the same way under both clusterings) to 1 (identical clusterings).
MI Mirkin's index. A number between 0 and 1. Mirkin's index is the fraction of pairs of objects for which the two classification methods disagree: MI = 1 - RI.
HI Hubert index. A number between -1 and 1. The Hubert index is the fraction of pairs of objects for which the two classification methods agree minus the fraction of pairs for which they disagree: HI = RI - MI.
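The definitions above can be sketched in base R from a contingency table. The helper `rand_sketch` is a hypothetical name, not part of tclust, and it uses the standard pair-counting formulas (including the usual Hubert and Arabie form of the adjusted index); it is meant to make the relations between the four indices concrete, not to reproduce the package's implementation.

```r
# Hypothetical helper (not part of tclust): Rand-type indices from a contingency table
rand_sketch <- function(tab) {
  tab <- as.matrix(tab)
  n_pairs <- choose(sum(tab), 2)           # all pairs of objects
  s_both  <- sum(choose(tab, 2))           # pairs grouped together in both partitions
  s_rows  <- sum(choose(rowSums(tab), 2))  # pairs grouped together in the first partition
  s_cols  <- sum(choose(colSums(tab), 2))  # pairs grouped together in the second partition

  # Agreements: pairs together in both partitions plus pairs separated in both
  agree <- n_pairs + 2 * s_both - s_rows - s_cols

  RI <- agree / n_pairs
  MI <- 1 - RI
  HI <- RI - MI

  # Chance correction: expected value of s_both under random labelling
  expected <- s_rows * s_cols / n_pairs
  AR <- (s_both - expected) / ((s_rows + s_cols) / 2 - expected)

  list(AR = AR, RI = RI, MI = MI, HI = HI)
}

tab <- matrix(c(1, 1, 0, 1, 2, 1, 0, 0, 4), nrow = 3)
res <- rand_sketch(tab)
str(res)
```

For this table there are 45 pairs, of which 32 are classified consistently, so RI = 32/45, MI = 13/45 and HI = 19/45.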
## 1. randIndex with the contingency table as input.
## (named tab rather than T, which would mask the shorthand for TRUE)
tab <- matrix(c(1, 1, 0, 1, 2, 1, 0, 0, 4), nrow=3)
(ARI <- randIndex(tab))
## 2. randIndex with the two vectors as input.
## Each row of labs holds the two labels of one unit (named labs to avoid masking c())
labs <- matrix(c(1, 1, 1, 2, 2, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3), ncol=2, byrow=TRUE)
## c1 = numeric vector containing the labels of the first partition
c1 <- labs[,1]
## c2 = numeric vector containing the labels of the second partition
c2 <- labs[,2]
(ARI <- randIndex(c1,c2))
## 3. Compare ARI for iris data (true classification against tclust classification)
library(tclust)
c1 <- iris$Species # first partition c1 is the true partition
out <- tclust(iris[, 1:4], k=3, alpha=0, restr.fact=100)
c2 <- out$cluster # second partition c2 is the output of tclust clustering procedure
randIndex(c1,c2)
## 4. Compare ARI for iris data (exclude unassigned units from tclust).
c1 <- iris$Species # first partition c1 is the true partition
out <- tclust(iris[,1:4], k=3, alpha=0.1, restr.fact=100)
c2 <- out$cluster # second partition c2 is the output of tclust clustering procedure
## Units labelled 0 in c2 are the trimmed (unassigned) observations
noisecluster <- 0
randIndex(c1, c2, noisecluster=noisecluster)