clonalDiversity: Calculate the clonal diversity for samples or groupings

View source: R/clonalDiversity.R

clonalDiversity    R Documentation

Calculate the clonal diversity for samples or groupings

Description

This function calculates traditional measures of species diversity by sample or group: the Shannon index, inverse Simpson index, normalized entropy, Gini-Simpson index, Chao1 index, and the abundance-based coverage estimator (ACE). By default, the function downsamples the data and bootstraps the calculation 100 times (n.boots = 100), returning the mean of the bootstrapped values for each metric. Use the group.by parameter to combine individual samples before the calculation. To obtain the underlying values as a table, set exportTable = TRUE.
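The downsampling and bootstrapping described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the clone labels are simulated, the helper `shannon()` and the objects `samples`, `min_size`, `n_boots`, and `boot_mean` are hypothetical names, and both the choice to downsample every sample to the size of the smallest one and the choice to resample with replacement are assumptions.

```r
set.seed(42)  # make the resampling reproducible

# Hypothetical clone labels for two samples of different size.
samples <- list(
  A = sample(paste0("clone", 1:50), size = 500, replace = TRUE),
  B = sample(paste0("clone", 1:20), size = 200, replace = TRUE)
)

# Shannon index computed from a vector of clone labels.
shannon <- function(clones) {
  p <- table(clones) / length(clones)  # proportion of each clone
  -sum(p * log(p))
}

# Assumption: every sample is downsampled to the size of the smallest sample.
min_size <- min(lengths(samples))
n_boots <- 100

# For each sample, resample min_size clones n_boots times (with replacement,
# also an assumption) and average the resulting Shannon values.
boot_mean <- sapply(samples, function(clones) {
  boots <- replicate(n_boots, {
    shannon(sample(clones, size = min_size, replace = TRUE))
  })
  mean(boots)
})

print(boot_mean)
```

Because every sample is evaluated at the same depth, the averaged values can be compared between samples of different size, which is presumably the motivation for the default behavior.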

Usage

clonalDiversity(
  input.data,
  cloneCall = "strict",
  chain = "both",
  group.by = NULL,
  x.axis = NULL,
  metrics = c("shannon", "inv.simpson", "norm.entropy", "gini.simpson", "chao1", "ACE"),
  exportTable = FALSE,
  palette = "inferno",
  n.boots = 100,
  return.boots = FALSE,
  skip.boots = FALSE
)

Arguments

input.data

The product of combineTCR, combineBCR, or combineExpression.

cloneCall

How to call the clone - VDJC gene (gene), CDR3 nucleotide (nt), CDR3 amino acid (aa), VDJC gene + CDR3 nucleotide (strict) or a custom variable in the data.

chain

Indicates whether both chains or a specific chain should be used - e.g. "both", "TRA", "TRG", "IGH", "IGL".

group.by

Variable by which to combine the data for the diversity calculation.

x.axis

Additional grouping variable used to space the samples along the x-axis.

metrics

The indices to use in diversity calculations - "shannon", "inv.simpson", "norm.entropy", "gini.simpson", "chao1", "ACE".

exportTable

Exports a table of the data into the global environment in addition to the visualization.

palette

Colors to use in the visualization - input any palette name listed by hcl.pals().

n.boots

Number of bootstrap iterations of downsampling used to calculate the mean diversity.

return.boots

Export the calculated bootstrapped values - automatically sets exportTable = TRUE.

skip.boots

Skip the downsampling and bootstrapping steps in the calculation.

Details

The formulas for the indices and estimators are as follows:

Shannon Index:

Index = -\sum_{i=1}^{S} p_i \ln(p_i)

Inverse Simpson Index:

Index = \frac{1}{(\sum_{i=1}^{S} p_i^2)}

Normalized Entropy:

Index = -\frac{\sum_{i=1}^{S} p_i \ln(p_i)}{\ln(S)}

Gini-Simpson Index:

Index = 1 - \sum_{i=1}^{S} p_i^2

Chao1 Index:

Index = S_{obs} + \frac{n_1(n_1-1)}{2(n_2+1)}

Abundance-based Coverage Estimator (ACE):

Index = S_{abund} + \frac{S_{rare}}{C_{ace}} + \frac{F_1}{C_{ace}}

Where:

  • p_i is the proportion of species (here, clone) i in the dataset.

  • S is the total number of species.

  • n_1 and n_2 are the number of singletons and doubletons, respectively.

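  • S_{obs} is the number of species observed; S_{abund} and S_{rare} are the numbers of abundant and rare species, C_{ace} is the estimated sample coverage, and F_1 is the number of singletons.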
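As a worked illustration, most of the formulas above can be evaluated directly on a vector of clone counts. The sketch below is not the package's internal code: `counts` is made-up data and `diversity_sketch()` is a hypothetical helper. Chao1 is computed in its bias-corrected form, with denominator 2(n_2 + 1), which is an assumption about how the denominator should be read. ACE is left out because it needs the coverage-related quantities, which are not fully specified here.

```r
# Hypothetical clone counts: number of cells observed for each distinct clone.
counts <- c(10, 5, 2, 2, 1, 1, 1)

# Illustrative helper (not part of scRepertoire): compute the indices
# directly from a vector of clone counts.
diversity_sketch <- function(counts) {
  counts <- counts[counts > 0]
  p  <- counts / sum(counts)   # p_i: proportion of each clone
  S  <- length(counts)         # S: number of distinct clones observed
  n1 <- sum(counts == 1)       # singletons
  n2 <- sum(counts == 2)       # doubletons

  shannon <- -sum(p * log(p))  # natural logarithm
  simpson <- sum(p^2)

  c(
    shannon      = shannon,
    inv.simpson  = 1 / simpson,
    norm.entropy = shannon / log(S),  # undefined when S == 1
    gini.simpson = 1 - simpson,
    # Assumption: bias-corrected Chao1 with denominator 2 * (n2 + 1).
    chao1        = S + n1 * (n1 - 1) / (2 * (n2 + 1))
  )
}

result <- diversity_sketch(counts)
print(round(result, 3))
```

With these counts there are 7 observed clones, 3 singletons, and 2 doubletons, so Chao1 evaluates to 7 + (3 * 2) / (2 * 3) = 8. The inverse Simpson and Gini-Simpson values are two views of the same sum of squared proportions.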

Value

ggplot of the diversity of clones by group

Author(s)

Andrew Malone, Nick Borcherding

Examples

# Making combined contig data
combined <- combineTCR(contig_list,
                       samples = c("P17B", "P17L", "P18B", "P18L",
                                   "P19B", "P19L", "P20B", "P20L"))
clonalDiversity(combined, cloneCall = "gene")


ncborcherding/scRepertoire documentation built on April 7, 2024, 12:44 a.m.