pldist: pldist


View source: R/pldist.R

Description

Calculates paired and longitudinal ecological distance/dissimilarity matrices. Includes qualitative and quantitative versions of Bray-Curtis, Jaccard, Kulczynski, Gower, and unweighted and generalized UniFrac distances/dissimilarities. The UniFrac-family metrics build in part on GUniFrac (Jun Chen & Hongzhe Li (2012)).

Usage

pldist(
  otus,
  metadata,
  paired = FALSE,
  binary = FALSE,
  clr = FALSE,
  pseudoct = NULL,
  method,
  tree = NULL,
  gam = c(0, 0.5, 1),
  norm = FALSE
)

Arguments

otus

OTU count or frequency table, containing one row per sample and one column per OTU.

metadata

Data frame with three columns: subject identifiers (n unique values, column name "subjID"), sample identifiers (must match row names of otus, column name "sampID"), and time point or group identifier (if using longitudinal distances, this must be numeric or convertible to numeric).
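
As a rough illustration of the layout described above, the following base R sketch builds a toy OTU table and a matching metadata data frame. The data are invented, and the third column's name ("time") is an assumption, since only "subjID" and "sampID" are named here.

```r
# Hypothetical toy data: 2 subjects, 2 samples each, 3 OTUs.
otus <- matrix(
  c(10, 0, 5,
     8, 2, 0,
     0, 7, 3,
     1, 6, 4),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("s1", "s2", "s3", "s4"), c("otu1", "otu2", "otu3"))
)

# "subjID" and "sampID" are the documented column names;
# the name "time" for the third column is an assumption.
metadata <- data.frame(
  subjID = c("A", "A", "B", "B"),
  sampID = c("s1", "s2", "s3", "s4"),
  time   = c(1, 2, 1, 2),
  stringsAsFactors = FALSE
)

# Sanity checks before calling pldist(): sample IDs line up with the
# OTU table's row names, and the time column is numeric.
stopifnot(setequal(metadata$sampID, rownames(otus)))
stopifnot(is.numeric(metadata$time))
```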

paired

Logical indicating whether to use the paired version of the metric (TRUE) or the longitudinal version (FALSE). Paired analysis is only possible when there are exactly 2 unique time points/identifiers for each subject or pair.
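
One way to check the "exactly 2 unique time points" requirement before setting paired = TRUE is sketched below in base R. The metadata are invented, the column name "time" is an assumption, and this is not necessarily how the package validates its input.

```r
# Hypothetical metadata; the column name "time" is an assumption.
metadata <- data.frame(
  subjID = c("A", "A", "B", "B", "C", "C", "C"),
  sampID = paste0("s", 1:7),
  time   = c(1, 2, 1, 2, 1, 2, 3)
)

# Count the unique time points/identifiers observed for each subject.
n_times <- tapply(metadata$time, metadata$subjID,
                  function(x) length(unique(x)))

# Subjects that do not have exactly 2 unique time points.
bad_subjects <- names(n_times)[n_times != 2]
paired_ok <- length(bad_subjects) == 0
```

Here subject "C" has three time points, so a paired analysis would not be appropriate for this table as it stands.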

binary

Logical indicating whether to use the qualitative (TRUE) or quantitative (FALSE) version of each metric. Qualitative analysis only incorporates changes in OTU presence or absence; quantitative analysis incorporates changes in abundance.
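
To make the distinction concrete, this small base R sketch contrasts the information available to a qualitative analysis (presence/absence) with that available to a quantitative one (within-sample proportions). It uses invented counts and only illustrates the two views of the data; the package's internal transformations may differ.

```r
# Hypothetical counts: 2 samples, 3 OTUs.
otus <- matrix(c(10, 0, 5,
                  8, 2, 0),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("s1", "s2"), c("otu1", "otu2", "otu3")))

# Qualitative view: 1 if the OTU is present in the sample, 0 otherwise.
presence <- (otus > 0) * 1

# Quantitative view: within-sample proportions (each row sums to 1).
props <- otus / rowSums(otus)
```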

clr

Logical indicating whether to use CLR-transformed abundances (TRUE) or original proportions (FALSE). Default FALSE.

pseudoct

Pseudocount value to be added to each cell of the matrix prior to CLR transformation. Default is NULL; if NULL, 0.5 will be added if data are counts, min(1e-06, 0.5*min(nonzero p)) will be added if data are proportions, and nothing will be added if no cells have zero values.
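
The default rule above can be sketched in base R as follows. The helper names are hypothetical, and the test used to tell counts from proportions (any row sum greater than 1) is an assumption of this sketch rather than documented behavior.

```r
# Choose a pseudocount following the documented default rule.
# Treating the table as counts when any row sums to more than 1 is an
# assumption of this sketch, not something the documentation specifies.
default_pseudocount <- function(x) {
  if (!any(x == 0)) return(0)                # no zeros: add nothing
  is_counts <- any(rowSums(x) > 1 + 1e-8)
  if (is_counts) 0.5 else min(1e-06, 0.5 * min(x[x > 0]))
}

# Centered log-ratio transform, applied to each sample (row).
clr_rows <- function(x, pseudoct = NULL) {
  if (is.null(pseudoct)) pseudoct <- default_pseudocount(x)
  logx <- log(x + pseudoct)
  logx - rowMeans(logx)   # subtract each row's mean log value
}

# Hypothetical counts with one zero cell.
counts <- matrix(c(10, 0, 5,
                    8, 2, 4),
                 nrow = 2, byrow = TRUE)
clr_counts <- clr_rows(counts)
```

Each row of a CLR-transformed table sums to zero, which gives a quick check that the transform was applied per sample.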

method

Desired distance metric. Choices are braycurtis, jaccard, kulczynski, gower, and unifrac, or any unambiguous abbreviation thereof.
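
"Unambiguous abbreviation" can be illustrated with base R's partial matching via match.arg(). Whether pldist resolves method this way internally is an assumption; resolve_method is a hypothetical helper used only for this sketch.

```r
methods <- c("braycurtis", "jaccard", "kulczynski", "gower", "unifrac")

# Hypothetical helper: resolve an abbreviation via partial matching.
resolve_method <- function(method) match.arg(method, methods)

resolve_method("bray")   # "braycurtis"
resolve_method("g")      # "gower"
resolve_method("j")      # "jaccard"
```

A string that matches none of the choices raises an error under this scheme.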

tree

Rooted phylogenetic tree of R class "phylo". Default NULL; only needed for UniFrac family distances.

gam

Parameter controlling weight on abundant lineages for UniFrac family distances. The same weight is used within a subject as between subjects. Default c(0, 0.5, 1).

norm

Logical indicating whether to normalize the difference to average taxon abundance. Default FALSE.

Value

Returns a list with elements:

D

If any metric other than UniFrac is used, D is an n x n distance (or dissimilarity) matrix. For UniFrac-family dissimilarities, D is an n x n x (K+1) array containing the paired or longitudinal UniFrac dissimilarities for the K specified gamma values plus the unweighted distance. The unweighted distance matrix may be accessed by D[,,"d_UW"], and the generalized dissimilarities by D[,,"d_G"], where G is the particular choice of gamma.

type

String indicating what type of dissimilarity was requested.
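
The slice-by-name access described for D can be tried on a mock array in base R. The array below is filled with random placeholder values rather than real dissimilarities, and the order of the slices is an assumption; only the "d_UW" and "d_G" naming pattern comes from the description above.

```r
# Mock array standing in for the D element of a UniFrac result:
# n = 3 subjects, K = 3 gamma values plus the unweighted slice.
# The values are random placeholders, not real dissimilarities.
gam <- c(0, 0.5, 1)
subj <- c("A", "B", "C")
slices <- c("d_UW", paste0("d_", gam))   # slice order is an assumption
set.seed(1)
D <- array(runif(3 * 3 * 4), dim = c(3, 3, 4),
           dimnames = list(subj, subj, slices))

# Pull out individual n x n matrices by slice name.
d_unweighted <- D[, , "d_UW"]
d_half       <- D[, , "d_0.5"]
```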

Examples

# Gower distance, paired & quantitative transformation 
pldist(paired.otus, paired.meta, paired = TRUE, binary = FALSE, method = "gower")$D

# Gower distance, paired & qualitative/binary transformation 
pldist(paired.otus, paired.meta, paired = TRUE, binary = TRUE, method = "gower")$D

# Gower distance, longitudinal & quantitative transformation 
pldist(bal.long.otus, bal.long.meta, paired = FALSE, binary = FALSE, method = "gower")$D

# Gower distance, longitudinal & qualitative/binary transformation 
pldist(bal.long.otus, bal.long.meta, paired = FALSE, binary = TRUE, method = "gower")$D

# Other distances 
pldist(paired.otus, paired.meta, paired = TRUE, binary = FALSE, method = "bray")$D
pldist(paired.otus, paired.meta, paired = TRUE, binary = FALSE, method = "kulczynski")$D
pldist(paired.otus, paired.meta, paired = TRUE, binary = FALSE, method = "jaccard")$D

# UniFrac additionally requires a phylogenetic tree and gamma values
# (gamma controls the weight placed on abundant lineages)
pldist(paired.otus, paired.meta, paired = TRUE, binary = FALSE,
    method = "unifrac", tree = sim.tree, gam = c(0, 0.5, 1), norm = FALSE)$D

aplantin/pldist documentation built on Feb. 26, 2021, 2:19 p.m.