calcARI: Calculate adjusted Rand index (ARI) by comparing two cluster...
In rliger: Linked Inference of Genomic Experimental Relationships

calcARI

R Documentation

Calculate adjusted Rand index (ARI) by comparing two cluster labeling variables

Description

This function calculates the adjusted Rand index (ARI) between the clustering result obtained with LIGER and an external clustering (an existing "true" annotation). ARI is at most 1, which indicates perfect agreement between the clusterings. A score near 0 indicates agreement no better than expected by chance, and negative values are possible when agreement is worse than chance.
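For reference, the index defined in Hubert and Arabie (1985) can be written as below. The symbols are notation introduced here for illustration, not names used by rliger: n_ij is the number of cells with true label i and cluster label j, a_i and b_j are the row and column sums of that contingency table, and n is the total number of cells. The numerator is the observed pair agreement minus the agreement expected by chance.

```latex
\mathrm{ARI} =
\frac{\sum_{ij}\binom{n_{ij}}{2}
      - \left[\sum_i\binom{a_i}{2}\sum_j\binom{b_j}{2}\right] \Big/ \binom{n}{2}}
     {\tfrac{1}{2}\left[\sum_i\binom{a_i}{2} + \sum_j\binom{b_j}{2}\right]
      - \left[\sum_i\binom{a_i}{2}\sum_j\binom{b_j}{2}\right] \Big/ \binom{n}{2}}
```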

The true clustering annotation must be specified as the baseline. We suggest storing it in the object's cellMeta so that it can easily be used by other visualization and evaluation functions.

The ARI can be calculated for selected datasets only, since a true annotation might not be available for every dataset. To evaluate only one or a few datasets, specify useDatasets. When useDatasets is specified, trueCluster and useCluster are checked against the cells in the specified datasets only.
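To make the quantity concrete, here is a minimal base-R sketch that computes an ARI from two plain label vectors and then restricts the comparison to the cells of one dataset. `ariSketch` and the toy vectors are hypothetical and exist only for illustration; this is not how calcARI is implemented and it does not operate on a liger object.

```r
# Hypothetical helper for illustration only; not part of rliger.
ariSketch <- function(truth, cluster) {
  stopifnot(length(truth) == length(cluster))
  tab <- table(truth, cluster)              # contingency table of the two labelings
  pairs <- function(v) sum(choose(v, 2))    # number of within-group cell pairs
  both <- pairs(tab)                        # pairs grouped together in both labelings
  inTruth <- pairs(rowSums(tab))            # pairs grouped together in the true labels
  inCluster <- pairs(colSums(tab))          # pairs grouped together in the clustering
  expected <- inTruth * inCluster / choose(length(truth), 2)  # chance-level agreement
  (both - expected) / ((inTruth + inCluster) / 2 - expected)
}

# Toy data: six cells from two made-up datasets
dataset <- c("ctrl", "ctrl", "ctrl", "stim", "stim", "stim")
truth   <- c("a", "a", "b", "a", "b", "b")
cluster <- c(1, 1, 2, 2, 1, 1)

# All cells: the value can fall below 0 when agreement is worse than chance
ariAll <- ariSketch(truth, cluster)

# Only the "stim" cells: both vectors are subset with the same index,
# so the labels being compared always refer to the same cells
isStim <- dataset == "stim"
ariStim <- ariSketch(truth[isStim], cluster[isStim])
```

The cluster names themselves do not need to match between the two labelings; only the way cells are grouped matters.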

Usage

calcARI(
  object,
  trueCluster,
  useCluster = NULL,
  useDatasets = NULL,
  verbose = getOption("ligerVerbose", TRUE),
  classes.compare = trueCluster
)

Arguments

`object`	A liger object, with the clustering result present in cellMeta.
`trueCluster`	Either the name of one variable in `cellMeta(object)` or a factor object with annotations that match all cells being considered.
`useCluster`	The name of one variable in `cellMeta(object)`. Default `NULL` uses default clusters.
`useDatasets`	A character vector of dataset names, or a numeric or logical vector indexing the datasets to be considered for the ARI calculation. Default `NULL` uses all datasets.
`verbose`	Logical. Whether to show progress information. Default `getOption("ligerVerbose")`, or `TRUE` if the option has not been set.
`classes.compare`	Deprecated. Use `trueCluster` instead.

Value

A numeric scalar, the ARI of the clustering result indicated by useCluster compared to trueCluster.

References

L. Hubert and P. Arabie (1985) Comparing Partitions, Journal of Classification, 2, pp. 193-218.

Examples

# Assume the true cluster in `pbmcPlot` is "leiden_cluster"
# generate fake new labeling
fake <- sample(1:7, ncol(pbmcPlot), replace = TRUE)
# Insert into cellMeta
pbmcPlot$new <- factor(fake)
calcARI(pbmcPlot, trueCluster = "leiden_cluster", useCluster = "new")

# Now assume we have an existing baseline annotation only for the "stim" dataset
nStim <- ncol(dataset(pbmcPlot, "stim"))
stimTrueLabel <- factor(fake[1:nStim])
# Insert into cellMeta
cellMeta(pbmcPlot, "stim_true_label", useDatasets = "stim") <- stimTrueLabel
# Assume "leiden_cluster" is the clustering result we obtained and that needs
# to be evaluated
calcARI(pbmcPlot, trueCluster = "stim_true_label",
        useCluster = "leiden_cluster", useDatasets = "stim")

# Comparison of the same labeling should always yield 1.
calcARI(pbmcPlot, trueCluster = "leiden_cluster", useCluster = "leiden_cluster")

rliger documentation built on Aug. 27, 2025, 1:08 a.m.