normalize,phyloseq-method | R Documentation |
It is critical to normalize the feature table to eliminate any bias due to differences in sequencing depth across samples. This function implements several widely used normalization methods for microbial compositional data.
For rarefying, reads in each sample are randomly removed until a predefined number is reached, so that all samples have the same library size. Rarefying is a standard normalization method in microbial ecology. Note, however, that the authors of phyloseq do not advocate rarefying as a normalization procedure, despite its recent popularity.
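The subsampling idea can be sketched in base R as follows. This is only an illustration, not the code behind norm_rarefy(): the helper rarefy_sample() and the toy table are hypothetical, features are assumed to be in rows and samples in columns, and reads are drawn without replacement, whereas the norm_rarefy() signature defaults to replace = TRUE.

```r
# Hypothetical helper: subsample one sample's reads down to `size`.
rarefy_sample <- function(x, size) {
  # Expand the counts into one entry per read, labelled by feature index.
  reads <- rep(seq_along(x), times = x)
  # Draw `size` reads without replacement and count them per feature.
  kept <- sample(reads, size = size, replace = FALSE)
  tabulate(kept, nbins = length(x))
}

# Toy feature table (assumed layout: features in rows, samples in columns).
counts <- matrix(
  c(10,  0, 30,
    20,  5, 25,
    70, 45, 45),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("otu", 1:3), paste0("s", 1:3))
)

set.seed(42)                      # make the random draw reproducible
depth <- min(colSums(counts))     # rarefy to the smallest library size
rarefied <- apply(counts, 2, rarefy_sample, size = depth)
dimnames(rarefied) <- dimnames(counts)
```

After this step every column of rarefied has the same total, and the smallest sample is left unchanged because all of its reads are kept.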
TSS simply transforms the feature table into relative abundances by dividing each count by the total number of reads in its sample.
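As a minimal base R sketch of this division (the toy table is made up, and features are assumed to be in rows with samples in columns):

```r
# Toy feature table (assumed layout: features in rows, samples in columns).
counts <- matrix(
  c(10,  0, 30,
    20,  5, 25,
    70, 45, 45),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("otu", 1:3), paste0("s", 1:3))
)

# Divide every column by its library size (total reads in that sample).
tss <- sweep(counts, 2, colSums(counts), "/")
```

Each column of tss then sums to 1, so the values can be read as proportions within a sample.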
CSS is based on the assumption that the count distributions in each sample are equivalent for low-abundance genes up to a certain threshold. Only the segment of each sample's count distribution that is relatively invariant across samples is scaled by CSS.
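A rough base R sketch of the cumulative-sum idea is given below. It is a simplification under stated assumptions: the function css_sketch() is hypothetical, the quantile p is fixed at 0.5 instead of being chosen from the data as metagenomeSeq does, and sl plays the role of the scaling constant from the norm_css() signature.

```r
# Hypothetical simplified CSS: scale each sample by the sum of its counts
# that lie at or below a fixed quantile of its non-zero counts.
css_sketch <- function(counts, p = 0.5, sl = 1000) {
  scale_factor <- apply(counts, 2, function(x) {
    nonzero <- x[x > 0]
    q <- stats::quantile(nonzero, probs = p, names = FALSE)
    sum(x[x <= q])                # cumulative sum up to the quantile
  })
  # Divide each column by its scaling factor, expressed in units of `sl`.
  sweep(counts, 2, scale_factor / sl, "/")
}

# Toy feature table (assumed layout: features in rows, samples in columns).
counts <- matrix(
  c(10,  0, 30,
    20,  5, 25,
    70, 45, 45),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("otu", 1:3), paste0("s", 1:3))
)

css <- css_sketch(counts)
```

Only the low-count part of each sample contributes to its scaling factor, so a few dominant features do not drive the normalization.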
RLE assumes most features are not differential and uses the relative abundances to calculate the normalization factor.
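The median-of-ratios calculation commonly associated with RLE can be sketched in base R as below. This is an assumption-laden illustration: rle_size_factors() is a hypothetical helper, it follows the "ratio" style of estimation in which features containing any zero are skipped, and it is not the code that norm_rle() runs.

```r
# Hypothetical helper: median-of-ratios size factors.
rle_size_factors <- function(counts) {
  log_counts <- log(counts)
  # Log geometric mean of each feature across samples (-Inf if any zero).
  log_geo_means <- rowMeans(log_counts)
  usable <- is.finite(log_geo_means)
  # For each sample, take the median log-ratio to the geometric means.
  apply(log_counts[usable, , drop = FALSE], 2, function(col) {
    exp(stats::median(col - log_geo_means[usable]))
  })
}

# Toy feature table (assumed layout: features in rows, samples in columns).
counts <- matrix(
  c(10,  0, 30,
    20,  5, 25,
    70, 45, 45),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("otu", 1:3), paste0("s", 1:3))
)

size_factors <- rle_size_factors(counts)
normalized <- sweep(counts, 2, size_factors, "/")
```

Because the median is used, a minority of strongly differential features has little influence on the size factors.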
TMM calculates the normalization factor using a robust statistic, based on the assumption that most features are not differential and should, on average, be equal between the samples. The TMM scaling factor is calculated as the weighted mean of log-ratios between each pair of samples, after excluding the highest-count OTUs and the OTUs with the largest log-fold change.
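The trimmed, weighted mean of log-ratios can be sketched for a single sample against a reference sample as follows. This is a simplified illustration rather than the edgeR implementation: tmm_factor() is a hypothetical helper, the weights are approximate inverse variances of the log-ratios, and the trimming fractions mirror the logratio_trim and sum_trim defaults shown in the norm_tmm() signature.

```r
# Hypothetical helper: TMM-style scaling factor of `obs` relative to `ref`.
tmm_factor <- function(obs, ref, logratio_trim = 0.3, sum_trim = 0.05) {
  n_obs <- sum(obs)
  n_ref <- sum(ref)
  # M values: log-ratios of relative abundances; A values: mean log abundance.
  m <- log2((obs / n_obs) / (ref / n_ref))
  a <- (log2(obs / n_obs) + log2(ref / n_ref)) / 2
  # Approximate variance of each log-ratio, used for inverse-variance weights.
  v <- (n_obs - obs) / n_obs / obs + (n_ref - ref) / n_ref / ref
  # Drop features with a zero count in either sample.
  ok <- is.finite(m) & is.finite(a)
  m <- m[ok]; a <- a[ok]; v <- v[ok]
  # Trim the extremes of M (log-fold change) and A (abundance) by rank.
  n <- length(m)
  lo_m <- floor(n * logratio_trim) + 1
  hi_m <- n + 1 - lo_m
  lo_a <- floor(n * sum_trim) + 1
  hi_a <- n + 1 - lo_a
  keep <- rank(m) >= lo_m & rank(m) <= hi_m &
    rank(a) >= lo_a & rank(a) <= hi_a
  # Weighted mean of the remaining log-ratios, back on the original scale.
  2^(sum(m[keep] / v[keep]) / sum(1 / v[keep]))
}

ref <- c(5, 10, 20, 40, 80, 160, 320, 640, 1280, 2560)
obs <- 2 * ref                    # same composition, twice the depth
f <- tmm_factor(obs, ref)
```

In this toy case the two samples differ only in depth, so every log-ratio of relative abundances is zero and no compositional correction is needed.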
In CLR, the log-ratios are computed relative to the geometric mean of all features.
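A minimal base R sketch of the centered log-ratio is shown below. The helper clr_sketch() is hypothetical, and the pseudocount of 1 added before taking logs is an assumption made here to handle zeros; the source does not state how norm_clr() treats them.

```r
# Hypothetical helper: centered log-ratio per sample.
clr_sketch <- function(counts, pseudocount = 1) {
  log_counts <- log(counts + pseudocount)
  # Subtracting the mean log equals dividing by the geometric mean.
  sweep(log_counts, 2, colMeans(log_counts), "-")
}

# Toy feature table (assumed layout: features in rows, samples in columns).
counts <- matrix(
  c(10,  0, 30,
    20,  5, 25,
    70, 45, 45),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("otu", 1:3), paste0("s", 1:3))
)

clr <- clr_sketch(counts)
```

By construction the CLR values of each sample sum to zero.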
norm_cpm: This normalization method is from the original LEfSe algorithm and is recommended when very low values are present (as shown in the LEfSe Galaxy).
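Assuming that CPM here means scaling each sample to a total of one million counts (an interpretation of the name, not something stated in the source), the calculation can be sketched in base R as:

```r
# Toy feature table (assumed layout: features in rows, samples in columns).
counts <- matrix(
  c(10,  0, 30,
    20,  5, 25,
    70, 45, 45),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("otu", 1:3), paste0("s", 1:3))
)

# Relative abundance per sample, rescaled so every column sums to 1e6.
cpm <- sweep(counts, 2, colSums(counts), "/") * 1e6
```

The result is the TSS table multiplied by a constant, which keeps very small proportions away from values close to zero.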
## S4 method for signature 'phyloseq'
normalize(object, method = "TSS", ...)
## S4 method for signature 'otu_table'
normalize(object, method = "TSS", ...)
## S4 method for signature 'data.frame'
normalize(object, method = "TSS", ...)
## S4 method for signature 'matrix'
normalize(object, method = "TSS", ...)
norm_rarefy(
object,
size = min(sample_sums(object)),
rng_seed = FALSE,
replace = TRUE,
trim_otus = TRUE,
verbose = TRUE
)
norm_tss(object)
norm_css(object, sl = 1000)
norm_rle(
object,
locfunc = stats::median,
type = c("poscounts", "ratio"),
geo_means = NULL,
control_genes = NULL
)
norm_tmm(
object,
ref_column = NULL,
logratio_trim = 0.3,
sum_trim = 0.05,
do_weighting = TRUE,
Acutoff = -1e+10
)
norm_clr(object)
norm_cpm(object)
object |
a phyloseq::phyloseq or phyloseq::otu_table |
method |
the method used to normalize the microbial abundance data. Options include:
|
... |
other arguments passed to the corresponding normalization methods. |
size , rng_seed , replace , trim_otus , verbose |
extra arguments passed to
|
sl |
The value to scale. |
locfunc |
a function to compute a location for a sample. By default, the median is used. |
type |
method for estimation: either "ratio" or "poscounts" (recommended). |
geo_means |
default |
control_genes |
default |
ref_column |
column to use as reference |
logratio_trim |
amount of trim to use on log-ratios |
sum_trim |
amount of trim to use on the combined absolute levels ("A" values) |
do_weighting |
whether to compute the weights or not |
Acutoff |
cutoff on "A" values to use before trimming |
An object of the same class as object.
Created by Yang Cao
edgeR::calcNormFactors()
DESeq2::estimateSizeFactorsForMatrix()
metagenomeSeq::cumNorm()
phyloseq::rarefy_even_depth()
metagenomeSeq::calcNormFactors()
data(caporaso)
normalize(caporaso, "TSS")