airr_public: Public indices - pairwise repertoire overlap

airr_public R Documentation

Public indices - pairwise repertoire overlap

Description

[Experimental]

A family of functions to quantify public (shared) receptors between repertoires.

Available functions

Supported methods are the following.

airr_public_intersection - number of shared receptors between each pair of repertoires (intersection size). Handy for quick overlap heatmaps, QC of replicate similarity, or spotting donor-shared "public" clonotypes.

airr_public_jaccard - Jaccard similarity of receptor sets between repertoires (|A \cap B| / |A \cup B|). Best when comparing repertoires of different sizes, because the shared count is normalized by the size of the union.
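
To make the two measures concrete, here is a minimal base-R sketch on two toy receptor sets. The sequences and the names rep_a and rep_b are made up for illustration, and this is not a description of how the package computes the values internally.

```r
# Toy repertoires (hypothetical receptor sequences; rep_a contains a duplicate)
rep_a <- c("CASSLGQYF", "CASSPGTEAFF", "CASRDNEQFF", "CASSLGQYF")
rep_b <- c("CASSPGTEAFF", "CASSQETQYF", "CASRDNEQFF")

# Both measures operate on sets of unique receptors
set_a <- unique(rep_a)
set_b <- unique(rep_b)

n_shared <- length(intersect(set_a, set_b))  # size of the intersection
n_union  <- length(union(set_a, set_b))      # size of the union
jaccard  <- n_shared / n_union               # intersection size / union size

print(c(shared = n_shared, union = n_union, jaccard = jaccard))
```

The intersection size grows with repertoire depth, while the Jaccard value divides it by the union size, so the two can rank the same pairs differently.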

Usage

airr_public_intersection(
  idata,
  autojoin = getOption("immundata.autojoin", TRUE),
  format = c("long", "wide")
)

airr_public_jaccard(
  idata,
  autojoin = getOption("immundata.autojoin", TRUE),
  format = c("long", "wide")
)

Arguments

idata

An ImmunData object.

autojoin

Logical. If TRUE, join repertoire metadata by the schema repertoire id. Change the default behaviour by calling options(immundata.autojoin = FALSE), which is the option read by the default shown in Usage.

format

String. One of "long" (a long tibble with imd_repertoire_id, facet columns, and value; useful for visualizations) or "wide" (a wide, unmelted table of features in which each row corresponds to a specific repertoire or pair of repertoires; useful for machine learning).
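
The autojoin default in Usage is resolved with getOption() and a fallback of TRUE. The following base-R sketch shows how that lookup behaves, assuming the option name from the Usage signature ("immundata.autojoin"); it does not call the package itself.

```r
# Temporarily unset the option so the fallback is visible (old value is saved)
old <- options(immundata.autojoin = NULL)

# With the option unset, getOption() returns the fallback given in Usage
default_value <- getOption("immundata.autojoin", TRUE)

# After setting the option, the same lookup returns the session-wide value
options(immundata.autojoin = FALSE)
session_value <- getOption("immundata.autojoin", TRUE)

# Restore whatever was set before this demo
options(old)

print(c(default = default_value, session = session_value))
```

Passing autojoin explicitly in a call overrides the option for that call only, since the option is only consulted as the argument's default.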

Value

airr_public_intersection

A symmetric numeric matrix where rows/columns are repertoire_id and each cell is the count of shared unique receptors. The diagonal contains per-repertoire richness (total unique receptors). Row/column names are repertoire IDs.

airr_public_jaccard

A symmetric numeric matrix where rows/columns are repertoire_id and each cell is the Jaccard similarity in [0, 1]. The diagonal is 1. Row/column names are repertoire IDs.
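
The matrix shapes described above can be illustrated with a small base-R sketch. The repertoire names (R1, R2, R3) and receptor identifiers are hypothetical, and the code only mimics the documented structure (symmetry, richness on the diagonal, Jaccard diagonal of 1); it is not the package's implementation.

```r
# Hypothetical repertoires as a named list of receptor identifiers
reps <- list(
  R1 = c("r1", "r2", "r3"),
  R2 = c("r2", "r3", "r4", "r5"),
  R3 = c("r6")
)
sets <- lapply(reps, unique)

# Pairwise intersection sizes; the diagonal holds per-repertoire richness
m_int <- sapply(sets, function(b) {
  sapply(sets, function(a) length(intersect(a, b)))
})

# Union sizes follow from |A| + |B| - (shared count)
richness <- diag(m_int)
m_uni <- outer(richness, richness, "+") - m_int

# Jaccard similarity; the diagonal is 1 for any non-empty repertoire
m_jac <- m_int / m_uni

print(m_int)
print(round(m_jac, 2))
```

In this sketch an empty repertoire would produce 0 / 0 on its diagonal, so such cases would need separate handling.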

See Also

immundata::ImmunData

Examples

# Limit the number of threads used by the underlying DB for this session.
# Change this only if you know what you're doing (e.g., multi-user machines, shared CI/servers).
db_exec("SET threads TO 1")
# Load data
immdata <- get_test_idata() |> agg_repertoires("Therapy")

#
# airr_public_intersection
#
## Not run: 
m_pub <- airr_public_intersection(immdata)

## End(Not run)

#
# airr_public_jaccard
#
## Not run: 
m_jac <- airr_public_jaccard(immdata)

## End(Not run)


immunarch documentation built on Nov. 5, 2025, 7:21 p.m.