rarefyAssay: Subsample Counts
In FelixErnst/mia: Microbiome analysis

Description

rarefyAssay randomly subsamples counts within a SummarizedExperiment object and returns a new SummarizedExperiment containing the original assay and the new subsampled assay.

Usage

rarefyAssay(x, ...)

## S4 method for signature 'SummarizedExperiment'
rarefyAssay(
  x,
  assay.type = assay_name,
  assay_name = "counts",
  sample = min_size,
  min_size = min(colSums2(assay(x, assay.type))),
  replace = FALSE,
  name = "subsampled",
  ...
)

Arguments

`x`	`TreeSummarizedExperiment`.
`...`	optional arguments: `verbose`: `Logical scalar`. Choose whether to show messages. (Default: `TRUE`)
`assay.type`	`Character scalar`. Specifies the name of assay used in calculation. (Default: `"counts"`)
`assay_name`	Deprecated. Use `assay.type` instead.
`sample`	`Integer scalar`. The number of counts to draw from each sample, i.e. the rarefying depth. This can be the lowest total count found in any sample or a user-specified number. (Default: `min(colSums2(assay(x, assay.type)))`)
`min_size`	Deprecated. Use `sample` instead.
`replace`	`Logical scalar`. Whether to perform subsampling with replacement. This works similarly to `sample(..., replace = TRUE)`. (Default: `FALSE`)
`name`	`Character scalar`. The name under which the subsampled assay is stored. (Default: `"subsampled"`)
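
The default rarefying depth shown in the usage (`min(colSums2(assay(x, assay.type)))`) is the smallest per-sample total count. A minimal sketch of that calculation on a toy matrix follows; the data are made up, and base `colSums` stands in for `colSums2` purely for illustration.

```r
# Toy count matrix: features in rows, samples in columns (hypothetical data).
counts <- matrix(
    c(10, 0, 5,
      20, 3, 0,
       7, 7, 7),
    nrow = 3, byrow = TRUE,
    dimnames = list(paste0("taxon", 1:3), paste0("sample", 1:3))
)

# Total counts (library size) of each sample.
lib_sizes <- colSums(counts)

# Default rarefying depth: the smallest library size across samples.
depth <- min(lib_sizes)
depth
```

Choosing the minimum keeps every sample, at the cost of discarding counts from all deeper samples.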

Details

Although the subsampling approach is highly debated in microbiome research, we include the rarefyAssay function because there may be some instances where it can be useful. Note that the output of rarefyAssay is not equivalent to the input, and any results have to be verified against the original dataset.

Subsampling/Rarefying may undermine downstream analyses and have unintended consequences. Therefore, make sure this normalization is appropriate for your data.

To keep results reproducible, set the random seed with set.seed() before calling this function.
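
As a small base R illustration of why the seed matters, the sketch below repeats a random draw with the same seed and obtains the same result. The helper `draw_once` is hypothetical and only stands in for any function that relies on R's random number generator.

```r
# Hypothetical helper: one random draw after fixing the seed.
draw_once <- function(seed) {
    set.seed(seed)
    sample(c("taxonA", "taxonB", "taxonC"), size = 10, replace = TRUE)
}

first  <- draw_once(123)
second <- draw_once(123)

# Same seed, same draw.
identical(first, second)
```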

When replace = FALSE, the function internally uses vegan::rarefy; when subsampling with replacement is enabled, it uses its own implementation, inspired by phyloseq::rarefy_even_depth.
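
To make the difference between the two modes concrete, here is a minimal base R sketch for a single sample. It is not the code used by rarefyAssay, vegan or phyloseq; `subsample_counts` is a hypothetical helper that treats every count as one read and draws reads either without or with replacement.

```r
# Hypothetical helper, for illustration only.
subsample_counts <- function(counts, depth, replace = FALSE) {
    # One entry per observed read, labelled by feature index.
    reads <- rep(seq_along(counts), times = counts)
    # Draw `depth` reads; indexing avoids sample()'s special case for length-1 input.
    drawn <- reads[sample.int(length(reads), size = depth, replace = replace)]
    # Tabulate the drawn reads back into a count vector.
    out <- tabulate(drawn, nbins = length(counts))
    names(out) <- names(counts)
    out
}

set.seed(42)
x <- c(taxonA = 50, taxonB = 30, taxonC = 20)

# Without replacement: a feature can never exceed its original count.
without_repl <- subsample_counts(x, depth = 40)

# With replacement: each draw follows the observed proportions.
with_repl <- subsample_counts(x, depth = 40, replace = TRUE)

sum(without_repl)
sum(with_repl)
```

Both results sum to the requested depth. Only the without-replacement draw is guaranteed to stay at or below the original count of each feature.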

Value

rarefyAssay returns x with the subsampled data stored as a new assay named by `name`.

References

McMurdie PJ, Holmes S. Waste not, want not: why rarefying microbiome data is inadmissible. PLoS computational biology. 2014 Apr 3;10(4):e1003531.

Gloor GB, Macklaim JM, Pawlowsky-Glahn V & Egozcue JJ (2017) Microbiome Datasets Are Compositional: And This Is Not Optional. Frontiers in Microbiology 8: 2224. doi: 10.3389/fmicb.2017.02224

Weiss S, Xu ZZ, Peddada S, Amir A, Bittinger K, Gonzalez A, Lozupone C, Zaneveld JR, Vázquez-Baeza Y, Birmingham A, Hyde ER. Normalization and microbial differential abundance strategies depend upon data characteristics. Microbiome. 2017 Dec;5(1):1-8.

Examples

# Samples in the TreeSE whose total counts are below the specified
# `sample` depth are removed. Features that are not present in any
# sample after subsampling are removed as well.
library(mia)
data(GlobalPatterns)
tse <- GlobalPatterns
set.seed(123)
tse_subsampled <- rarefyAssay(tse, sample = 60000, name = "subsampled")
tse_subsampled
dim(tse)
dim(assay(tse_subsampled, "subsampled"))
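
The filtering described in the comments above can be sketched in base R on a toy matrix. This is an illustrative assumption about the behaviour, not the package's implementation; the data and variable names are made up.

```r
# Toy count matrix: features in rows, samples in columns (hypothetical data).
counts <- matrix(
    c(6, 1, 3,
      0, 0, 0,
      4, 1, 9),
    nrow = 3, byrow = TRUE,
    dimnames = list(paste0("taxon", 1:3), paste0("sample", 1:3))
)
depth <- 10
set.seed(123)

# 1. Drop samples whose total counts are below the rarefying depth.
kept <- counts[, colSums(counts) >= depth, drop = FALSE]

# 2. Subsample each remaining sample to `depth` reads without replacement.
rarefied <- apply(kept, 2, function(col) {
    reads <- rep(seq_along(col), times = col)
    tabulate(reads[sample.int(length(reads), size = depth)], nbins = length(col))
})
rownames(rarefied) <- rownames(kept)

# 3. Drop features that are absent from every remaining sample.
rarefied <- rarefied[rowSums(rarefied) > 0, , drop = FALSE]

dim(counts)
dim(rarefied)
```

Here `sample2` has only 2 counts and is dropped, and `taxon2` is absent from every sample and is dropped, so the rarefied matrix is smaller than the input in both dimensions.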

FelixErnst/mia documentation built on July 16, 2025, 8:08 p.m.