Please note that the authors of phyloseq do not advocate using this
as a normalization procedure, despite its recent popularity.
Our justifications for using alternative approaches to address
disparities in library sizes have been made available as
an article in PLoS Computational Biology.
See phyloseq_to_deseq2 for a recommended alternative to rarefying
directly supported in the phyloseq package, as well as
the supplemental materials for the PLoS-CB article
and the phyloseq extensions repository on GitHub.
Nevertheless, for comparison and demonstration, the rarefying procedure is implemented
here in good faith and with options we hope are useful.
This function uses the standard R
sample function to
resample from the abundance values in the
otu_table component of the first argument.
Often one of the major goals of this procedure is to achieve parity in
total number of counts between samples, as an alternative to other formal
normalization procedures, which is why a single value for the
sample.size is expected.
This kind of resampling can be performed with and without replacement,
with replacement being the more computationally-efficient, default setting.
See the replace parameter documentation for more details.
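The resampling described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's actual implementation: the helper `rarefy_counts` and the toy matrix `otu` are hypothetical names introduced here, and only the base functions `sample.int`, `rep`, and `tabulate` are used. Each column (sample) is subsampled to the smallest library size so that all column totals match.

```r
# Hypothetical helper for illustration; not part of phyloseq.
# Subsamples one sample's count vector down to `depth` reads.
rarefy_counts <- function(counts, depth, replace = TRUE) {
  # Without replacement we cannot draw more reads than were observed.
  stopifnot(replace || depth <= sum(counts))
  n_taxa <- length(counts)
  if (replace) {
    # Draw taxa indices with probability proportional to abundance.
    draws <- sample.int(n_taxa, size = depth, replace = TRUE, prob = counts)
  } else {
    # Expand to one entry per read, then draw reads without replacement.
    # The expansion is what makes this variant the more expensive one.
    reads <- rep(seq_len(n_taxa), times = counts)
    draws <- reads[sample.int(length(reads), size = depth)]
  }
  out <- tabulate(draws, nbins = n_taxa)
  names(out) <- names(counts)
  out
}

# Toy abundance matrix (assumed data): taxa in rows, samples in columns.
otu <- matrix(c(120, 30, 5, 0,
                40, 40, 10, 10,
                300, 0, 20, 80),
              nrow = 4,
              dimnames = list(paste0("OTU", 1:4), c("A", "B", "C")))

set.seed(711)
depth <- min(colSums(otu))  # a single target depth shared by all samples
rarefied <- apply(otu, 2, rarefy_counts, depth = depth, replace = FALSE)

colSums(otu)       # unequal library sizes before subsampling
colSums(rarefied)  # every sample now totals `depth`
```

Drawing without replacement can never return more reads of a taxon than were observed, whereas drawing with replacement can; that difference is the practical trade-off behind the choice of setting.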
We recommend that you explicitly select a random number generator seed
before invoking this function, or, alternatively, that you
explicitly provide a single positive integer argument as rngseed.
(Optional). A single integer value equal to the number
of reads being simulated, also known as the depth,
and also equal to each value returned by
(Optional). A single integer value passed to
(Optional). Logical. Whether to sample with replacement
(Optional). Logical. Default is
This approach is sometimes mistakenly called “rarefaction”, which
in physics refers to a form of wave decompression;
but in this context, ecology, the term refers to a
repeated sampling procedure to assess species richness,
first proposed in 1968 by Howard Sanders.
In contrast, the procedure implemented here is used as an ad hoc means to
normalize microbiome counts that have
resulted from libraries of widely-differing sizes.
Here we have intentionally adopted an alternative
name, rarefy, that has also been used recently
to describe this process
and, to our knowledge, not previously used in ecology.
Make sure to use
set.seed for exactly-reproducible results
of the random subsampling.
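The effect of fixing the seed can be checked with a small base R sketch. The helper `subsample` and the count vector are hypothetical, introduced only for illustration; the point is that resetting the seed before each call reproduces the same draws, while continuing without a reset generally does not.

```r
# Hypothetical helper for illustration; not a phyloseq function.
# Draws `depth` reads with replacement, proportional to abundance.
subsample <- function(x, depth) {
  draws <- sample.int(length(x), size = depth, replace = TRUE, prob = x)
  tabulate(draws, nbins = length(x))
}

counts <- c(OTU1 = 120, OTU2 = 30, OTU3 = 5)  # assumed toy data

set.seed(711)
first <- subsample(counts, 50)

set.seed(711)                    # same seed again
second <- subsample(counts, 50)

third <- subsample(counts, 50)   # no reset: continues the RNG stream

identical(first, second)  # TRUE: same seed, same subsample
```

Recording the seed alongside the results is what allows the subsampled table to be regenerated exactly later on.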
An object of class phyloseq.
The otu_table component is modified.
library("phyloseq")
# Test with esophagus dataset
data("esophagus")
esorepT = rarefy_even_depth(esophagus, replace=TRUE)
esorepF = rarefy_even_depth(esophagus, replace=FALSE)
sample_sums(esophagus)
sample_sums(esorepT)
sample_sums(esorepF)
## NRun Manually: Too slow!
# data("GlobalPatterns")
# GPrepT = rarefy_even_depth(GlobalPatterns, 1E5, replace=TRUE)
## Actually just this one is slow
# system.time(GPrepF <- rarefy_even_depth(GlobalPatterns, 1E5, replace=FALSE))
You set `rngseed` to FALSE. Make sure you've set & recorded
 the random seed of your session for reproducibility.
See `?set.seed`

...
8OTUs were removed because they are no longer
present in any sample after random subsampling

...
You set `rngseed` to FALSE. Make sure you've set & recorded
 the random seed of your session for reproducibility.
See `?set.seed`

...
  B   C   D
203 255 219
  B   C   D
203 203 203
  B   C   D
203 203 203
Warning message:
system call failed: Cannot allocate memory