calculateUnifrac: Calculate weighted or unweighted (Fast) Unifrac distance

calculateUnifracR Documentation

Calculate weighted or unweighted (Fast) Unifrac distance

Description

This function calculates the (Fast) Unifrac distance for all sample-pairs in a TreeSummarizedExperiment object.

Usage

calculateUnifrac(x, tree, ...)

## S4 method for signature 'ANY,phylo'
calculateUnifrac(
  x,
  tree,
  weighted = FALSE,
  normalized = TRUE,
  BPPARAM = SerialParam(),
  ...
)

## S4 method for signature 'TreeSummarizedExperiment,missing'
calculateUnifrac(
  x,
  assay.type = assay_name,
  assay_name = exprs_values,
  exprs_values = "counts",
  tree_name = "phylo",
  transposed = FALSE,
  ...
)

runUnifrac(
  x,
  tree,
  weighted = FALSE,
  normalized = TRUE,
  nodeLab = NULL,
  BPPARAM = SerialParam(),
  ...
)

Arguments

x

a numeric matrix or a TreeSummarizedExperiment object containing a tree.

Please note that runUnifrac expects a matrix with samples per row and not per column. This is implemented to be compatible with other distance calculations such as dist as much as possible.

tree

if x is a matrix, a phylo object matching the matrix. This means that the phylo object and the columns should relate to the same type of features (aka. microorganisms).

...

optional arguments not used.

weighted

TRUE or FALSE: Should use weighted-Unifrac calculation? Weighted-Unifrac takes into account the relative abundance of species/taxa shared between samples, whereas unweighted-Unifrac only considers presence/absence. Default is FALSE, meaning the unweighted-Unifrac distance is calculated for all pairs of samples.

normalized

TRUE or FALSE: Should the output be normalized such that values range from 0 to 1 independent of branch length values? Default is TRUE. Note that (unweighted) Unifrac is always normalized by total branch-length, and so this value is ignored when weighted == FALSE.

BPPARAM

A BiocParallelParam object specifying whether the Unifrac calculation should be parallelized.

assay.type

a single character value for specifying which assay to use for calculation.

assay_name

a single character value for specifying which assay to use for calculation. (Please use assay.type instead. At some point assay_name will be disabled.)

exprs_values

a single character value for specifying which assay to use for calculation. (Please use assay.type instead.)

tree_name

a single character value for specifying which tree will be used in calculation. (By default: tree_name = "phylo")

transposed

Logical scalar, is x transposed with cells in rows, i.e., is Unifrac distance calculated based on rows (FALSE) or columns (TRUE). (By default: transposed = FALSE)

nodeLab

if x is a matrix, a character vector specifying links between rows/columns and tips of tree. The length must equal the number of rows/columns of x. Furthermore, all the node labs must be present in tree.

Details

Please note that if calculateUnifrac is used as a FUN for runMDS, the argument ntop has to be set to nrow(x).

Value

a sample-by-sample distance matrix, suitable for NMDS, etc.

Author(s)

Paul J. McMurdie. Adapted for mia by Felix G.M. Ernst

References

http://bmf.colorado.edu/unifrac/

The main implementation (Fast Unifrac) is adapted from the algorithm's description in:

Hamady, Lozupone, and Knight, “Fast UniFrac: facilitating high-throughput phylogenetic analyses of microbial communities including analysis of pyrosequencing and PhyloChip data.” The ISME Journal (2010) 4, 17–27.

See also additional descriptions of Unifrac in the following articles:

Lozupone, Hamady and Knight, “Unifrac - An Online Tool for Comparing Microbial Community Diversity in a Phylogenetic Context.”, BMC Bioinformatics 2006, 7:371

Lozupone, Hamady, Kelley and Knight, “Quantitative and qualitative (beta) diversity measures lead to different insights into factors that structure microbial communities.” Appl Environ Microbiol. 2007

Lozupone C, Knight R. “Unifrac: a new phylogenetic method for comparing microbial communities.” Appl Environ Microbiol. 2005 71 (12):8228-35.

Examples

data(esophagus)
library(scater)
calculateUnifrac(esophagus, weighted = FALSE)
calculateUnifrac(esophagus, weighted = TRUE)
calculateUnifrac(esophagus, weighted = TRUE, normalized = FALSE)
# for using calculateUnifrac in conjunction with runMDS the tree argument
# has to be given separately. In addition, subsetting using ntop must
# be disabled
esophagus <- runMDS(esophagus, FUN = calculateUnifrac, name = "Unifrac",
                    tree = rowTree(esophagus),
                    exprs_values = "counts",
                    ntop = nrow(esophagus))
reducedDim(esophagus)

microbiome/mia documentation built on April 27, 2024, 4:04 a.m.