Write, Analyze, and Visualize 'BIOM' Data

taxa_corrplot

R Documentation

Visualize taxa abundance with scatterplots and trendlines.

Description

Visualize taxa abundance with scatterplots and trendlines.

Usage

taxa_corrplot(
  biom,
  x,
  rank = -1,
  layers = "tc",
  taxa = 6,
  lineage = FALSE,
  unc = "singly",
  other = FALSE,
  stat.by = NULL,
  facet.by = NULL,
  colors = TRUE,
  shapes = TRUE,
  test = "emmeans",
  fit = "gam",
  at = NULL,
  level = 0.95,
  p.adj = "fdr",
  transform = "none",
  ties = "random",
  seed = 0,
  alt = "!=",
  mu = 0,
  caption = TRUE,
  check = FALSE,
  ...
)

Arguments

`biom`	An rbiom object, such as from `as_rbiom()`. Any value accepted by `as_rbiom()` can also be given here.
`x`	Dataset field with the x-axis values. Equivalent to the `regr` argument in `stats_table()`. Required.
`rank`	What rank(s) of taxa to display. E.g. `"Phylum"`, `"Genus"`, `".otu"`, etc. An integer vector can also be given, where `1` is the highest rank, `2` is the second highest, `-1` is the lowest rank, `-2` is the second lowest, and `0` is the OTU "rank". Run `biom$ranks` to see all options for a given rbiom object. Default: `-1`.
`layers`	One or more of `c("trend", "confidence", "point", "name", "residual")`. Single letter abbreviations are also accepted. For instance, `c("trend", "point")` is equivalent to `c("t", "p")` and `"tp"`. Default: `"tc"`
`taxa`	Which taxa to display. An integer value will show the top n most abundant taxa. A value 0 <= n < 1 will show any taxa with that mean abundance or greater (e.g. `0.1` implies >= 10%). A character vector of taxa names will show only those named taxa. Default: `6`.
`lineage`	Include all ranks in the name of the taxa. For instance, setting to `TRUE` will produce `Bacteria; Actinobacteria; Coriobacteriia; Coriobacteriales`. Otherwise the taxa name will simply be `Coriobacteriales`. Set this to `TRUE` when `unc = "asis"` and you have taxa names (such as Incertae_Sedis) that map to multiple higher level ranks. Default: `FALSE`
`unc`	How to handle unclassified, uncultured, and similarly ambiguous taxa names. Options are: `"singly"` - Replaces them with the OTU name. `"grouped"` - Replaces them with a higher rank's name. `"drop"` - Excludes them from the result. `"asis"` - To not check/modify any taxa names. Abbreviations are allowed. Default: `"singly"`
`other`	Sum all non-itemized taxa into an "Other" taxon. When `FALSE`, only returns taxa matched by the `taxa` argument. Specifying `TRUE` adds "Other" to the returned set. A string can also be given to imply `TRUE`, but with that value as the name to use instead of "Other". Default: `FALSE`
`stat.by`	Dataset field with the statistical groups. Must be categorical. Default: `NULL`
`facet.by`	Dataset field(s) to use for faceting. Must be categorical. Default: `NULL`
`colors`	How to color the groups. Options are: `TRUE` - Automatically select colorblind-friendly colors. `FALSE` or `NULL` - Don't use colors. A palette name - Auto-select colors from this set, e.g. `"okabe"`. A character vector - Custom colors to use, e.g. `c("red", "#00FF00")`. A named character vector - Explicit mapping, e.g. `c(Male = "blue", Female = "red")`. See "Aesthetics" section below for additional information. Default: `TRUE`
`shapes`	Shapes for each group. Options are similar to those for `colors`: `TRUE`, `FALSE`, `NULL`, shape names (typically integers 0 - 17), or a named vector mapping groups to specific shape names. See "Aesthetics" section below for additional information. Default: `TRUE`
`test`	Method for computing p-values: `'none'`, `'emmeans'`, or `'emtrends'`. Default: `'emmeans'`
`fit`	How to fit the trendline. `'lm'`, `'log'`, or `'gam'`. Default: `'gam'`
`at`	Position(s) along the x-axis where the means or slopes should be evaluated. Default: `NULL`, which samples 100 evenly spaced positions and selects the position where the p-value is most significant.
`level`	The confidence level for calculating a confidence interval. Default: `0.95`
`p.adj`	Method to use for multiple comparisons adjustment of p-values. Run `p.adjust.methods` for a list of available options. Default: `"fdr"`
`transform`	Transformation to apply. Options are: `c("none", "rank", "log", "log1p", "sqrt", "percent")`. `"rank"` is useful for correcting for non-normal distributions before applying regression statistics. Default: `"none"`
`ties`	When `transform="rank"`, how to rank identical values. Options are: `c("average", "first", "last", "random", "max", "min")`. See `rank()` for details. Default: `"random"`
`seed`	Random seed for permutations. Must be a non-negative integer. Default: `0`
`alt`	Alternative hypothesis direction. Options are `'!='` (two-sided; not equal to `mu`), `'<'` (less than `mu`), or `'>'` (greater than `mu`). Default: `'!='`
`mu`	Reference value to test against. Default: `0`
`caption`	Add methodology caption beneath the plot. Default: `TRUE`
`check`	Generate additional plots to aid in assessing data normality. Default: `FALSE`
`...`	Additional parameters to pass along to ggplot2 functions. Prefix a parameter name with a layer name to pass it to only that layer. For instance, `p.size = 2` ensures only the points have their size set to `2`.
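
The `ties` and `p.adj` descriptions above point to base R's `rank()` and `p.adjust.methods`. The following is a minimal base R sketch of what those options mean, using made-up numbers (`abundance` and `pvals` are illustrative values, not rbiom output). It does not show how `taxa_corrplot()` applies them internally.

```r
# Illustrative abundances containing a tie (the two 7s).
abundance <- c(12, 0, 7, 7, 30)

# "average" gives tied values the mean of the ranks they occupy.
avg_ranks <- rank(abundance, ties.method = "average")
print(avg_ranks)

# "random" breaks ties by chance; fixing the RNG seed makes it repeatable.
set.seed(0)
random_ranks <- rank(abundance, ties.method = "random")
print(random_ranks)

# The adjustment methods that can be named in `p.adj`.
print(p.adjust.methods)

# "fdr" is base R's alias for the Benjamini-Hochberg ("BH") adjustment.
pvals <- c(0.01, 0.04, 0.03, 0.20)
adj_fdr <- p.adjust(pvals, method = "fdr")
adj_bh  <- p.adjust(pvals, method = "BH")
print(adj_fdr)
```

With `"random"`, the two tied values receive ranks 2 and 3 in an arbitrary order, which is why a seed matters for reproducibility.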

Value

A ggplot2 plot. The computed data points, ggplot2 command, stats table, and stats table commands are available as `$data`, `$code`, `$stats`, and `$stats$code`, respectively.
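
As a sketch of how these attributes might be inspected, the snippet below reuses the call from the Examples section and prints each attached element. The variable name `corr_plot` is arbitrary, and the block is guarded so it only runs when rbiom is installed.

```r
if (requireNamespace("rbiom", quietly = TRUE)) {
  library(rbiom)

  # Same subset and call as in the Examples section.
  biom <- rarefy(subset(hmp50, `Body Site` %in% c("Buccal mucosa", "Saliva")))
  corr_plot <- taxa_corrplot(
    biom, x = "BMI", stat.by = "Body Site", taxa = "Streptococcus")

  print(head(corr_plot$data))   # computed data points behind the plot
  print(corr_plot$code)         # ggplot2 command used to draw the plot
  print(corr_plot$stats)        # statistics table
  print(corr_plot$stats$code)   # command that produced the statistics table
}
```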

Aesthetics

All built-in color palettes are colorblind-friendly. The available categorical palette names are: "okabe", "carto", "r4", "polychrome", "tol", "bright", "light", "muted", "vibrant", "tableau", "classic", "alphabet", "tableau20", "kelly", and "fishy".

Shapes can be given as per base R: numbers 0 through 17 for various shapes, or the decimal value of an ASCII character (e.g. A-Z = 65:90; a-z = 97:122) to use letters instead of shapes on the plot. Character strings may be used as well.
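
The decimal codes for letters can be looked up in base R rather than memorized. A minimal sketch using `utf8ToInt()`, independent of rbiom:

```r
# Decimal ASCII codes for the upper- and lower-case alphabets.
upper_codes <- utf8ToInt(paste(LETTERS, collapse = ""))
lower_codes <- utf8ToInt(paste(letters, collapse = ""))

print(range(upper_codes))  # codes spanned by A-Z
print(range(lower_codes))  # codes spanned by a-z

# Codes for specific letters, e.g. to plot "M" and "F" instead of symbols.
mf_codes <- utf8ToInt("MF")
print(mf_codes)
```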

Examples

    library(rbiom) 
    
    biom <- rarefy(subset(hmp50, `Body Site` %in% c('Buccal mucosa', 'Saliva')))
    taxa_corrplot(biom, x = "BMI", stat.by = "Body Site", taxa = 'Streptococcus')
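
A further sketch, building on the example above, combines several documented arguments: `layers = "tp"` for trendline plus points, a linear fit, slope-based p-values via `test = "emtrends"`, the `"okabe"` palette, and the layer-prefixed `p.size` to resize only the points. The variable name `lm_plot` is arbitrary, and the block is guarded so it only runs when rbiom is installed.

```r
if (requireNamespace("rbiom", quietly = TRUE)) {
  library(rbiom)

  biom <- rarefy(subset(hmp50, `Body Site` %in% c("Buccal mucosa", "Saliva")))

  lm_plot <- taxa_corrplot(
    biom,
    x       = "BMI",
    stat.by = "Body Site",
    taxa    = "Streptococcus",
    layers  = "tp",        # trend + point layers, no confidence band
    fit     = "lm",        # straight-line fit instead of the default "gam"
    test    = "emtrends",  # one of the documented p-value methods
    colors  = "okabe",     # one of the built-in palette names
    p.size  = 2)           # "p." prefix targets the point layer only

  print(lm_plot$stats)
}
```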

cmmr/rbiom documentation built on June 10, 2025, 9:24 p.m.