This convenience wrapper converts raw counts to the log2-counts per million (log2-CPM) scale, after normalizing for library size and adding a small count offset.
mat | Probe by sample matrix of raw counts.
filter | Optional vector of length 2 specifying the filter criterion. Each probe must have at least
method | Normalization method to be used. See Details.
lcpm applies the voom transformation to sequencing data, converting counts to approximately normal distributions on the log2-CPM scale. Data can then be modelled using traditional linear techniques (at least once precision weights have been applied; see Law, et al. (2014)), or used for unsupervised clustering analysis via PCA, MDS, or other methods.
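To make the transformation concrete, here is a minimal Python sketch of the log2-CPM formula described by Law, et al. (2014), in which 0.5 is added to each count and 1 to each library size. It is an illustration of the arithmetic only, not the package's R code: the function name `log2_cpm` is hypothetical, and whether lcpm uses exactly these offsets is an assumption.

```python
import math

def log2_cpm(counts, lib_sizes=None):
    """Return log2-CPM values for a probe-by-sample matrix (list of rows).

    Assumes the voom-style offsets: +0.5 on each count, +1 on each library size.
    """
    n_samples = len(counts[0])
    if lib_sizes is None:
        # Default library size: the column sum of the raw counts.
        lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    return [
        [math.log2((row[j] + 0.5) / (lib_sizes[j] + 1.0) * 1e6)
         for j in range(n_samples)]
        for row in counts
    ]

# Toy matrix: 3 probes by 2 samples (library sizes 100 and 200).
counts = [[10, 20], [0, 5], [90, 175]]
lcpm_values = log2_cpm(counts)
```

The count offset keeps zero counts finite on the log scale, which is why the second probe above still receives a value in the first sample.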
It is recommended that low count genes be filtered out prior to transformation. There is no general algorithm for determining the most appropriate expression filter for a given data set. As a rule of thumb, the limma authors advise setting filter[1] to either 1, or 10 / (L / 1,000,000), where L = the minimum library size for a given count matrix. The former corresponds to a log2-CPM of 0, while the latter may be preferable in cases where read depth is especially shallow. For filter[2], the authors recommend using the number of replicates in the largest group, to guarantee that a gene is expressed in at least one sample for any groupwise comparison. These are broad guidelines, however, not strict rules.
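The rule of thumb above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the package's implementation: the helper names are hypothetical, and the sketch assumes that a probe passes when its CPM is at least filter[1] in at least filter[2] samples.

```python
def cpm(counts):
    """Counts per million for a probe-by-sample matrix (list of rows)."""
    n_samples = len(counts[0])
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    return [[row[j] / lib_sizes[j] * 1e6 for j in range(n_samples)]
            for row in counts]

def depth_aware_threshold(lib_sizes, min_count=10):
    """The 10 / (L / 1,000,000) rule, with L the minimum library size."""
    return min_count / (min(lib_sizes) / 1e6)

def keep_rows(counts, min_cpm, min_samples):
    """Assumed criterion: CPM >= min_cpm in at least min_samples samples."""
    return [sum(v >= min_cpm for v in row) >= min_samples
            for row in cpm(counts)]

# With a minimum library size of 2 million reads, 10 reads equal 5 CPM.
threshold = depth_aware_threshold([2_000_000, 5_000_000])

# Toy matrix with two libraries of exactly 1 million reads each.
counts = [[500, 700], [0, 1], [999500, 999299]]
keep = keep_rows(counts, min_cpm=1, min_samples=2)
```

Here the second probe is dropped because it reaches the threshold in at most one of the two samples.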
method = "TMM" is the weighted trimmed mean of M-values (to the reference) proposed by Robinson & Oshlack (2010), where the weights are from the delta method on binomial data.
method = "RLE" is the scaling factor method proposed by Anders & Huber (2010). We call it "relative log expression" because the median library is calculated from the geometric mean of all columns, and the median ratio of each sample to the median library is taken as the scale factor.
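As a rough illustration of the median-ratio idea, the following Python sketch builds the "median library" from per-probe geometric means and takes each sample's median ratio to it. The function name is hypothetical, rows containing a zero are skipped because their geometric mean is zero (an assumption about zero handling), and any further scaling by library size is left out.

```python
import math
from statistics import median

def rle_scale_factors(counts):
    """Median ratio of each sample to a geometric-mean reference library."""
    n_samples = len(counts[0])
    # The geometric mean is only informative for strictly positive rows.
    usable = [row for row in counts if all(c > 0 for c in row)]
    # Work on the log scale: the log geometric mean is the mean of the logs.
    log_ref = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - ref
                      for row, ref in zip(usable, log_ref)]
        factors.append(math.exp(median(log_ratios)))
    return factors

# The second sample is sequenced exactly twice as deeply as the first.
counts = [[10, 20], [100, 200], [5, 10], [0, 7]]
rle = rle_scale_factors(counts)
```

In this toy case the second factor is twice the first, which reflects the twofold difference in depth between the two samples.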
method = "upperquartile" is the upper-quartile normalization method of Bullard, et al. (2010), in which the scale factors are calculated from the 75% quantile of the counts for each library, after removing genes that are zero in all libraries.
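The upper-quartile calculation can be sketched as follows. This is a simplified illustration with hypothetical function names: it assumes the linear-interpolation quantile definition and assumes that each library's 75% quantile is divided by its library size to give an unscaled factor.

```python
import math

def quantile_type7(values, p):
    """Linear-interpolation quantile (the default definition in R and NumPy)."""
    s = sorted(values)
    h = (len(s) - 1) * p
    lo = math.floor(h)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (h - lo) * (s[hi] - s[lo])

def upper_quartile_factors(counts, p=0.75):
    """Per-library 75% quantile of counts, relative to library size."""
    n_samples = len(counts[0])
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    # Drop probes that are zero in every library before taking quantiles.
    expressed = [row for row in counts if any(c > 0 for c in row)]
    return [
        quantile_type7([row[j] for row in expressed], p) / lib_sizes[j]
        for j in range(n_samples)
    ]

counts = [[0, 0], [1, 2], [2, 4], [3, 6], [4, 8], [5, 10]]
uq = upper_quartile_factors(counts)
```

Removing the all-zero first row changes the first library's 75% quantile from 3.75 to 4, which shows why the trimming step affects the result.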
If method = "none", then the normalization factors are set to 1.
For symmetry, normalization factors are adjusted to multiply to 1. The effective library size is then the original library size multiplied by the scaling factor.
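The rescaling step is simple arithmetic: dividing each factor by the geometric mean of all factors makes their product equal to 1. A minimal Python sketch, with hypothetical function names and made-up input values:

```python
import math

def rescale_to_unit_product(factors):
    """Divide by the geometric mean so that the factors multiply to 1."""
    geo_mean = math.exp(sum(math.log(f) for f in factors) / len(factors))
    return [f / geo_mean for f in factors]

def effective_library_sizes(lib_sizes, norm_factors):
    """Effective size = original library size times its normalization factor."""
    return [n * f for n, f in zip(lib_sizes, norm_factors)]

raw_factors = [0.5, 2.0, 4.0]                  # illustrative values only
norm_factors = rescale_to_unit_product(raw_factors)
eff_sizes = effective_library_sizes([1_000_000, 1_500_000, 2_000_000],
                                    norm_factors)
```

Rescaling leaves the ratios between factors unchanged, so only their overall scale is fixed by this step.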
Note that rows that have zero counts for all columns are trimmed before normalization factors are computed. Therefore rows with all zero counts do not affect the estimated factors.
A numeric matrix of normalized counts on the log2-CPM scale.
Law, C.W., Chen, Y., Shi, W., & Smyth, G.K. (2014). "voom: precision weights unlock linear model analysis tools for RNA-seq read counts." Genome Biology, 15:R29. https://genomebiology.biomedcentral.com/articles/10.1186/gb-2014-15-2-r29
Anders, S. & Huber, W. (2010). "Differential expression analysis for sequence count data." Genome Biology, 11:R106. https://genomebiology.biomedcentral.com/articles/10.1186/gb-2010-11-10-r106
Bullard, J.H., Purdom, E., Hansen, K.D. & Dudoit, S. (2010). "Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments." BMC Bioinformatics, 11:94. http://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-11-94
Robinson, M.D. & Oshlack, A. (2010). "A scaling normalization method for differential expression analysis of RNA-seq data." Genome Biology, 11:R25. https://genomebiology.biomedcentral.com/articles/10.1186/gb-2010-11-3-r25