Write, Analyze, and Visualize 'BIOM' Data

bdiv_table

R Documentation

Distance / dissimilarity between samples.

Description

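Calculates beta diversity distances (dissimilarities) between pairs of samples and returns them as a long-format table, a square matrix, or a `dist` object.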

Usage

bdiv_table(
  biom,
  bdiv = "Bray-Curtis",
  weighted = TRUE,
  normalized = TRUE,
  tree = NULL,
  md = ".all",
  within = NULL,
  between = NULL,
  delta = ".all",
  transform = "none",
  ties = "random",
  seed = 0,
  cpus = NULL
)

bdiv_matrix(
  biom,
  bdiv = "Bray-Curtis",
  weighted = TRUE,
  normalized = TRUE,
  tree = NULL,
  within = NULL,
  between = NULL,
  transform = "none",
  ties = "random",
  seed = 0,
  cpus = NULL,
  underscores = FALSE
)

bdiv_distmat(
  biom,
  bdiv = "Bray-Curtis",
  weighted = TRUE,
  normalized = TRUE,
  tree = NULL,
  within = NULL,
  between = NULL,
  transform = "none",
  cpus = NULL
)

Arguments

`biom`	An rbiom object, such as from `as_rbiom()`. Any value accepted by `as_rbiom()` can also be given here.
`bdiv`	Beta diversity distance algorithm(s) to use. Options are: `"Bray-Curtis"`, `"Manhattan"`, `"Euclidean"`, `"Jaccard"`, and `"UniFrac"`. For `"UniFrac"`, a phylogenetic tree must be present in `biom` or explicitly provided via `tree=`. Multiple/abbreviated values allowed. Default: `"Bray-Curtis"`
`weighted`	Take relative abundances into account. When `weighted=FALSE`, only presence/absence is considered. Multiple values allowed. Default: `TRUE`
`normalized`	Only changes the "Weighted UniFrac" calculation. Divides result by the total branch weights. Default: `TRUE`
`tree`	A `phylo` object representing the phylogenetic relationships of the taxa in `biom`. Only required when computing UniFrac distances. Default: `biom$tree`
`md`	Dataset field(s) to include in the output data frame, or `'.all'` to include all metadata fields. Default: `'.all'`
`within`, `between`	Dataset field(s) for intra- or inter- sample comparisons. Alternatively, dataset field names given elsewhere can be prefixed with `'=='` or `'!='` to assign them to `within` or `between`, respectively. Default: `NULL`
`delta`	For numeric metadata, report the absolute difference in values for the two samples, for instance `2` instead of `"10 vs 12"`. Default: `'.all'`
`transform`	Transformation to apply. Options are: `c("none", "rank", "log", "log1p", "sqrt", "percent")`. `"rank"` is useful for correcting for non-normal distributions before applying regression statistics. Default: `"none"`
`ties`	When `transform="rank"`, how to rank identical values. Options are: `c("average", "first", "last", "random", "max", "min")`. See `rank()` for details. Default: `"random"`
`seed`	Random seed for permutations. Must be a non-negative integer. Default: `0`
`cpus`	The number of CPUs to use. Set to `NULL` to use all available, or to `1` to disable parallel processing. Default: `NULL`
`underscores`	When parsing the tree, should underscores be kept as is? By default they will be converted to spaces (unless the entire ID is quoted). Default: `FALSE`
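
Two of the arguments above, `weighted` and `ties`, are easiest to see on toy data. The following is a minimal base R sketch, not rbiom's internal code: `bray_curtis()` is a hypothetical helper using the textbook Bray-Curtis formula on raw values, the abundance vectors are made up, and the `ties` options are assumed to map onto `ties.method` in base `rank()`, as the reference to `rank()` suggests.

```r
# Illustrative sketch only; rbiom's internal implementation may differ.

# Textbook Bray-Curtis dissimilarity between two abundance vectors
# (hypothetical helper, not part of rbiom).
bray_curtis <- function(x, y, weighted = TRUE) {
  if (!weighted) {
    # Presence/absence: any non-zero abundance becomes 1
    x <- as.numeric(x > 0)
    y <- as.numeric(y > 0)
  }
  sum(abs(x - y)) / sum(x + y)
}

# Hypothetical abundances for two samples across four taxa
s1 <- c(10, 0, 5, 5)
s2 <- c(5, 5, 0, 10)

bc_weighted   <- bray_curtis(s1, s2, weighted = TRUE)   # uses abundances
bc_unweighted <- bray_curtis(s1, s2, weighted = FALSE)  # presence/absence only

# Ranking values that contain a tie, using base rank()
d <- c(0.2, 0.5, 0.2, 0.9)
rank_average <- rank(d, ties.method = "average")  # tied values share the mean rank
rank_min     <- rank(d, ties.method = "min")      # tied values share the lowest rank
rank_first   <- rank(d, ties.method = "first")    # ties broken by order of appearance
```

On this toy input the weighted value is 0.5 and the presence/absence value is 1/3, which shows how much the choice of `weighted` can change a result for the same pair of samples.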

Value

`bdiv_matrix()`: An R matrix of samples x samples.
`bdiv_distmat()`: A `dist`-class distance matrix.
`bdiv_table()`: A tibble data.frame with column names `.sample1`, `.sample2`, `.weighted`, `.bdiv`, `.distance`, and any fields requested by `md`. Numeric metadata fields will be returned as `abs(x - y)`; categorical metadata fields as `"x"`, `"y"`, or `"x vs y"`.
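
The three return shapes carry the same pairwise distances in different layouts. As a rough illustration that does not use rbiom, the base R sketch below builds all three from a made-up count matrix using `dist()` with Manhattan distance. The data and variable names are assumptions, and the long table here omits the `.weighted`, `.bdiv`, and metadata columns that `bdiv_table()` is documented to return.

```r
# Hypothetical counts: 3 samples (rows) x 4 taxa (columns)
counts <- matrix(
  c(10, 0, 5,  5,
     5, 5, 0, 10,
     0, 8, 2,  0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("A", "B", "C"), paste0("taxon", 1:4))
)

# dist-class object: the kind of value bdiv_distmat() is documented to return
dm <- dist(counts, method = "manhattan")

# Square samples x samples matrix: the layout of bdiv_matrix()
mtx <- as.matrix(dm)

# Long format: one row per unordered sample pair, loosely like bdiv_table()
idx <- which(lower.tri(mtx), arr.ind = TRUE)
long <- data.frame(
  .sample1  = colnames(mtx)[idx[, "col"]],
  .sample2  = rownames(mtx)[idx[, "row"]],
  .distance = mtx[idx],
  stringsAsFactors = FALSE
)
```

The `dist` form is what functions such as `hclust()` expect, while the long form is usually more convenient for joining metadata and plotting.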

Metadata Comparisons

Prefix metadata fields with `==` or `!=` to limit comparisons to within or between groups, respectively. For example, `md = '==Sex'` will return only intra-group comparisons, labeled `"Male"` and `"Female"`, but NOT `"Female vs Male"`. Similarly, setting `md = '!=Body Site'` will only show the inter-group comparisons, such as `"Saliva vs Stool"`, `"Anterior nares vs Buccal mucosa"`, and so on.

The same effect can be achieved by using the `within` and `between` parameters. `md = '==Sex'` is equivalent to `md = 'Sex', within = 'Sex'`.
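
To make the within/between distinction concrete, here is a small base R sketch of the filtering idea on a made-up long-format table. It is not how rbiom implements the feature; the sample IDs, distances, and `Sex` values are all hypothetical.

```r
# Hypothetical long-format distances for four samples
pairs <- data.frame(
  .sample1  = c("S1", "S1", "S1", "S2", "S2", "S3"),
  .sample2  = c("S2", "S3", "S4", "S3", "S4", "S4"),
  .distance = c(0.2, 0.7, 0.6, 0.8, 0.5, 0.3),
  stringsAsFactors = FALSE
)

# Hypothetical categorical metadata, keyed by sample ID
sex <- c(S1 = "Female", S2 = "Female", S3 = "Male", S4 = "Male")

sex1 <- unname(sex[pairs$.sample1])
sex2 <- unname(sex[pairs$.sample2])

# Label each pair "x" when both samples agree, "x vs y" otherwise,
# with the two values placed in alphabetical order
pairs$Sex <- ifelse(
  sex1 == sex2,
  sex1,
  ifelse(sex1 < sex2, paste(sex1, "vs", sex2), paste(sex2, "vs", sex1))
)

within_sex  <- pairs[sex1 == sex2, ]  # analogous to the '==' prefix / within
between_sex <- pairs[sex1 != sex2, ]  # analogous to the '!=' prefix / between
```

In this toy table, `within_sex` keeps the two same-sex pairs (labeled "Female" and "Male"), and `between_sex` keeps the four "Female vs Male" pairs.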

Examples

    library(rbiom)
    
    # Subset to four samples
    biom <- hmp50$clone()
    biom$counts <- biom$counts[,c("HMP18", "HMP19", "HMP20", "HMP21")]
    
    # Return in long format with metadata
    bdiv_table(biom, 'unifrac', md = ".all")
    
    # Only compare samples that share the same body site
    bdiv_table(biom, 'unifrac', md = c("==Body Site", "Sex"))
    
    # Or between males and females
    bdiv_table(biom, 'unifrac', md = c("Body Site", "!=Sex"))
    
    # All-vs-all matrix
    bdiv_matrix(biom, 'unifrac')
    
    # All-vs-all distance matrix
    dm <- bdiv_distmat(biom, 'unifrac')
    dm
    plot(hclust(dm))

rbiom documentation built on June 28, 2025, 1:07 a.m.