coassignProb: Compute coassignment probabilities
In scran: Methods for Single-Cell RNA-Seq Data Analysis


Compute coassignment probabilities for each label in a reference grouping when compared to an alternative grouping of samples. This function is now deprecated in favor of pairwiseRand.

coassignProb(ref, alt, summarize = FALSE)

`ref`	A character vector or factor containing one set of groupings, considered to be the reference.
`alt`	A character vector or factor containing another set of groupings, to be compared to `ref`.
`summarize`	Logical scalar indicating whether the output matrix should be converted into a per-label summary.

The coassignment probability for each pair of labels in ref is the probability that two randomly chosen cells, one from each of the two reference labels, will have the same label in alt. A high coassignment probability indicates that a particular pair of labels in ref is frequently assigned to the same label in alt, which has some implications for cluster stability.
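The definition above can be sketched in base R from a contingency table of the two groupings. This is only an illustration under stated assumptions: `coassign_sketch` is a hypothetical helper rather than the scran implementation, the toy `ref` and `alt` vectors are made up, and the sketch returns a full symmetric matrix instead of filling only the upper triangle.

```r
# Hypothetical helper (not part of scran): coassignment probabilities
# computed from the cross-tabulation of two groupings.
coassign_sketch <- function(ref, alt) {
    tab <- unclass(table(ref, alt))   # rows: ref labels, columns: alt labels
    prop <- tab / rowSums(tab)        # row i: how ref label i is spread across alt
    out <- prop %*% t(prop)           # entry (i, j): sum over alt labels of p_i * p_j
    dimnames(out) <- list(rownames(tab), rownames(tab))
    out
}

# Made-up groupings for eight cells.
ref <- c("A", "A", "A", "A", "B", "B", "B", "B")
alt <- c("x", "x", "x", "y", "x", "y", "y", "y")

probs <- coassign_sketch(ref, alt)
probs
```

Here label A places 3 of its 4 cells in x and label B places 3 of its 4 cells in y, so a cell drawn from A and a cell drawn from B share an alt label with probability 0.75 * 0.25 + 0.25 * 0.75 = 0.375.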

When summarize=TRUE, we summarize the matrix of coassignment probabilities into a set of per-label values. The “self” coassignment probability is simply the diagonal entry of the matrix, i.e., the probability that two cells from the same label in ref also have the same label in alt. The “other” coassignment probability is the maximum probability across all pairs involving that label and a different label.

In general, ref is well-recapitulated by alt if the diagonal entries of the matrix are much higher than the sum of the off-diagonal entries. This manifests as higher values for the self probabilities compared to the other probabilities.
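As a rough sketch of the per-label summary, the self and other values can be derived from a symmetric probability matrix in base R. The matrix `probs` below is invented for illustration, and a plain `data.frame` stands in for the DataFrame that the function itself returns; treating "other" as the maximum over off-diagonal entries is an assumption based on the description above.

```r
# Hypothetical symmetric matrix of coassignment probabilities for three labels.
probs <- matrix(c(0.80, 0.10, 0.05,
                  0.10, 0.60, 0.30,
                  0.05, 0.30, 0.55),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

self <- diag(probs)          # probability that two cells of one label stay together

off <- probs
diag(off) <- -Inf            # exclude the self pair from the maximum
other <- apply(off, 1, max)  # largest probability shared with a different label

summary_sketch <- data.frame(self = self, other = other)
summary_sketch
```

In this made-up case every label has a self value above its other value, which is the pattern expected when ref is well-recapitulated by alt. Label C is the least clear-cut, with a self value of 0.55 against an other value of 0.30.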

Note that the coassignment probability is closely related to the Rand index-based ratios broken down by cluster pair in pairwiseRand with mode="ratio" and adjusted=FALSE. The off-diagonal coassignment probabilities are simply 1 minus the off-diagonal ratio, while the on-diagonal values differ only by the lack of consideration of pairs of the same cell in pairwiseRand.
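The relationship described above can be illustrated by counting pairs of cells directly. This sketch does not call pairwiseRand; the `ratio_*` values are simple pair-counting quantities written to match the description in the preceding paragraph, and the toy groupings are made up.

```r
# Made-up groupings for eight cells.
ref <- c("A", "A", "A", "A", "B", "B", "B", "B")
alt <- c("x", "x", "x", "y", "x", "y", "y", "y")

in_a <- which(ref == "A")
in_b <- which(ref == "B")

# Off-diagonal: every (cell in A, cell in B) pair.
same_ab <- outer(alt[in_a], alt[in_b], "==")
coassign_ab <- mean(same_ab)    # fraction of pairs sharing an alt label
ratio_ab <- mean(!same_ab)      # fraction of pairs separated in alt

# Diagonal: pairs of cells drawn from A.
same_aa <- outer(alt[in_a], alt[in_a], "==")
coassign_aa <- mean(same_aa)                    # includes each cell paired with itself
ratio_aa <- mean(same_aa[upper.tri(same_aa)])   # pairs of distinct cells only

c(coassign_ab = coassign_ab, ratio_ab = ratio_ab,
  coassign_aa = coassign_aa, ratio_aa = ratio_aa)
```

The off-diagonal values sum to 1 by construction. The two diagonal values differ only because the first counts each cell paired with itself, which always agrees, and the second does not.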

If summarize=FALSE, a numeric matrix is returned with upper triangular entries filled with the coassignment probabilities for each pair of labels in ref.

Otherwise, a DataFrame is returned with one row per label in ref containing the self and other coassignment probabilities.

Aaron Lun

bootstrapCluster, to compute coassignment probabilities across bootstrap replicates.

pairwiseRand, for another way to compare different clusterings.

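library(scuttle)
library(scran)  # provides coassignProb()

# Simulate a small dataset and compute log-normalized expression values.
sce <- mockSCE(ncells=200)
sce <- logNormCounts(sce)

# Cluster the cells twice with different numbers of centers.
clust1 <- kmeans(t(logcounts(sce)),3)$cluster
clust2 <- kmeans(t(logcounts(sce)),5)$cluster

# Matrix of coassignment probabilities, then the per-label summary.
coassignProb(clust1, clust2)
coassignProb(clust1, clust2, summarize=TRUE)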