JS_desponds: Compute the Jensen-Shannon distance between two fitted...

Description Usage Arguments Details Value Author(s) References See Also Examples

View source: R/comparative_analysis.R

Description

After the Desponds et al. (2016) model havs been fit to your samples, the pairwise JS distance can be computed between them. This function takes the fitted model parameters from 2 distributions and computes the JS distance between them. When all pairwise distances have been computed, they can be used to do hierarchical clustering. This function assumes you denote one distribution as P and one as Q.

Usage

1
JS_desponds(grid, Cminp, Cminq, alphap, alphaq)

Arguments

grid

Vector of integers over which to compute the JS distance. The minimum of the grid is ideally the minimum count of all samples being compared. The maximum is ideally something very large (e.g. 100,000) in order to all or nearly all of the model density. The grid should include every integer in its range. See Examples.

Cminp

The estimated threshold for distribution P.

Cminq

The estimated threshold for distribution Q.

alphap

The estimated parameter alpha for distribution P.

alphaq

The estimated parameter alpha for distribution Q.

Details

For 2 discrete distributions P and Q, the Jensen-Shannon distance between them is

JSD(P,Q) =√ .5 * \int [P(t) log P(t)/M(t)] + \int [Q(t) log Q(t)/M(t)] dt

where

M(t) = .5 * (P(t) + Q(t)).

Value

The function directly returns the Jensen-Shannon distance between two fitted distributions P and Q.

Author(s)

hbk5086@psu.edu

References

Desponds, Jonathan, Thierry Mora, and Aleksandra M. Walczak. "Fluctuating fitness shapes the clone-size distribution of immune repertoires." Proceedings of the National Academy of Sciences 113.2 (2016): 274-279. APA

See Also

JS_spliced, JS_dist

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
data("repertoires")

# Fit the discrete gamma-gpd spliced model at some selected threshold on 2 samples
fit1 <- fdesponds(repertoires[[1]])
fit2 <- fdesponds(repertoires[[2]])

# Create a grid of every integer from the minimum threshold to a large value
# When comparing many distributions in advance of clustering,
# the same grid should be used across every comparison
# The chosen "large value" here is only 1,000, for the sake of quick computation.
# Ideally, the large value will be at least 100,000
grid <- min(c(fit1['Cmin'], fit2['Cmin'])):1000

# Compute the Jensen-Shannon distance between fit1 and fit2
dist <- JS_desponds(grid,
                    Cminp = fit1['Cmin'],
                    Cminq = fit2['Cmin'],
                    alphap = fit1['pareto.alpha'],
                    alphaq = fit2['pareto.alpha'])

dist

hillarykoch/powerTCR documentation built on March 17, 2021, 8:05 p.m.