clonalDiversity: Calculate the clonal diversity for samples or groupings
In ncborcherding/scRepertoire: A toolkit for single-cell immune receptor profiling

clonalDiversity

R Documentation

Calculate the clonal diversity for samples or groupings

Description

This function calculates traditional measures of diversity by sample or group: Shannon, inverse Simpson, normalized entropy, Gini-Simpson, the Chao1 index, and the abundance-based coverage estimator (ACE). By default, the function downsamples the data and computes the diversity metrics over 100 bootstraps (n.boots = 100), then outputs the mean of the values. The group.by parameter can be used to condense the individual samples. If a table of the values is preferred, set exportTable = TRUE.
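
The downsample-and-bootstrap idea described above can be pictured roughly as follows. This is a minimal base-R sketch, not the package's implementation: the helper names `shannon_index` and `bootstrap_diversity`, the choice to downsample every group to the smallest group size, and sampling with replacement are all assumptions made for illustration.

```r
# Hypothetical helper: Shannon index from a vector of clone labels
shannon_index <- function(clones) {
  p <- as.numeric(table(clones)) / length(clones)
  -sum(p * log(p))
}

# Hypothetical helper: downsample each group to the smallest group size
# (an assumption), recompute the index on every bootstrap draw, and
# report the mean across draws.
bootstrap_diversity <- function(groups, n.boots = 100) {
  min_size <- min(lengths(groups))
  vapply(groups, function(clones) {
    boots <- replicate(n.boots, {
      draw <- sample(clones, size = min_size, replace = TRUE)
      shannon_index(draw)
    })
    mean(boots)
  }, numeric(1))
}

# Toy data: one clone label per cell
groups <- list(
  sampleA = rep(c("c1", "c2", "c3", "c4"), times = c(50, 30, 15, 5)),
  sampleB = rep(c("c1", "c2"), times = c(38, 2))
)

set.seed(42)
boot_means <- bootstrap_diversity(groups, n.boots = 100)
print(boot_means)
```

Downsampling to a common size keeps a large sample from looking more diverse simply because it contains more cells, and averaging over many draws smooths out the noise that a single random subsample would introduce.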

Usage

clonalDiversity(
  input.data,
  cloneCall = "strict",
  chain = "both",
  group.by = NULL,
  order.by = NULL,
  x.axis = NULL,
  metrics = c("shannon", "inv.simpson", "norm.entropy", "gini.simpson", "chao1", "ACE"),
  exportTable = FALSE,
  palette = "inferno",
  n.boots = 100,
  return.boots = FALSE,
  skip.boots = FALSE
)

Arguments

`input.data`	The product of `combineTCR()`, `combineBCR()`, or `combineExpression()`.
`cloneCall`	How to call the clone: VDJC gene ("gene"), CDR3 nucleotide ("nt"), CDR3 amino acid ("aa"), VDJC gene + CDR3 nucleotide ("strict"), or a custom variable in the data.
`chain`	Whether to use both chains or a specific chain, e.g. "both", "TRA", "TRG", "IGH", "IGL".
`group.by`	Variable by which to combine samples for the diversity calculation.
`order.by`	A vector giving a specific plotting order, or "alphanumeric" to plot groups in alphanumeric order.
`x.axis`	Additional grouping variable used to space the samples along the x-axis.
`metrics`	The indices to use in diversity calculations: "shannon", "inv.simpson", "norm.entropy", "gini.simpson", "chao1", "ACE".
`exportTable`	Exports a table of the data into the global environment in addition to the visualization.
`palette`	Colors to use in the visualization; accepts any palette name from hcl.pals().
`n.boots`	Number of bootstrap iterations used for downsampling when computing the mean diversity.
`return.boots`	Export the bootstrapped values that were calculated; automatically sets exportTable = TRUE.
`skip.boots`	Skip downsampling and bootstrapping in the calculation.
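
As a rough illustration of how several of these arguments combine, the sketch below restricts the calculation to two metrics, lowers the number of bootstraps, and requests the table. It uses only arguments listed in the Usage block; it assumes that `exportTable = TRUE` hands the table back as the function's value and that the `contig_list` example data is available once the package is loaded. The `requireNamespace()` guard is there only so the snippet does nothing when scRepertoire is not installed.

```r
# Sketch only: runs the call when scRepertoire is installed, otherwise skips.
div_table <- NULL
if (requireNamespace("scRepertoire", quietly = TRUE)) {
  library(scRepertoire)
  combined <- combineTCR(contig_list,
                         samples = c("P17B", "P17L", "P18B", "P18L",
                                     "P19B", "P19L", "P20B", "P20L"))
  # Two metrics, 20 bootstraps, clones called by CDR3 amino acid sequence.
  # Assumption: exportTable = TRUE returns the table as the function's value.
  div_table <- clonalDiversity(combined,
                               cloneCall = "aa",
                               metrics = c("shannon", "inv.simpson"),
                               n.boots = 20,
                               exportTable = TRUE)
  print(head(div_table))
}
```

Setting `skip.boots = TRUE` instead of lowering `n.boots` would remove the downsampling and bootstrapping step altogether, as described in the argument list.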

Details

The formulas for the indices and estimators are as follows:

Shannon Index:

Index = -\sum_{i=1}^{S} p_i \ln(p_i)

Inverse Simpson Index:

Index = \frac{1}{(\sum_{i=1}^{S} p_i^2)}

Normalized Entropy:

Index = -\frac{\sum_{i=1}^{S} p_i \ln(p_i)}{\ln(S)}

Gini-Simpson Index:

Index = 1 - \sum_{i=1}^{S} p_i^2

Chao1 Index:

Index = S_{obs} + \frac{n_1(n_1-1)}{2(n_2+1)}

Abundance-based Coverage Estimator (ACE):

Index = S_{abund} + \frac{S_{rare}}{C_{ace}} + \frac{F_1}{C_{ace}}

Where:

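p_i is the proportion of species i in the dataset (here, each species is a clone).
S is the total number of species, and S_{obs} is the number of species observed in the sample.
n_1 and n_2 are the number of singletons and doubletons, respectively.
S_{abund} is the number of abundant species, S_{rare} is the number of rare species, C_{ace} is the estimated sample coverage, and F_1 is the number of singletons; all four are derived from the data.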
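
To make the formulas concrete, the following base-R sketch computes five of the indices from a vector of clone counts. The function `diversity_indices` is a hypothetical helper written for this illustration, not part of scRepertoire, and it uses the bias-corrected form of Chao1 with denominator 2(n_2 + 1). ACE is left out because its derived parameters are not fully specified here.

```r
# Hypothetical helper (not part of scRepertoire): diversity indices from a
# vector of clone counts, one count per distinct clone.
diversity_indices <- function(counts) {
  counts <- counts[counts > 0]
  p  <- counts / sum(counts)   # p_i: proportion of each clone
  S  <- length(counts)         # number of observed clones
  n1 <- sum(counts == 1)       # singletons
  n2 <- sum(counts == 2)       # doubletons
  shannon <- -sum(p * log(p))
  c(
    shannon      = shannon,
    inv.simpson  = 1 / sum(p^2),
    # Undefined (NaN) when only one clone is present, since log(1) = 0
    norm.entropy = shannon / log(S),
    gini.simpson = 1 - sum(p^2),
    # Bias-corrected Chao1 (assumed form)
    chao1        = S + n1 * (n1 - 1) / (2 * (n2 + 1))
  )
}

# Seven clones, 15 cells in total: 3 singletons and 2 doubletons
counts  <- c(5, 3, 2, 2, 1, 1, 1)
indices <- diversity_indices(counts)
print(round(indices, 3))
```

For these counts the sum of squared proportions is 45/225 = 0.2, so the inverse Simpson index is 5 and the Gini-Simpson index is 0.8; Chao1 is 7 + (3 × 2) / (2 × 3) = 8.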

Value

A ggplot object showing the diversity of clones by group.

Author(s)

Andrew Malone, Nick Borcherding

Examples

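# Load the package; contig_list is the example data used below
library(scRepertoire)

# Making combined contig data
combined <- combineTCR(contig_list,
                       samples = c("P17B", "P17L", "P18B", "P18L",
                                   "P19B", "P19L", "P20B", "P20L"))
# Plot clonal diversity, calling clones by VDJC gene
clonalDiversity(combined, cloneCall = "gene")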

ncborcherding/scRepertoire documentation built on June 9, 2025, 1:42 p.m.