lcpm: Transform Counts
In dswatson/biomisc: Convenient wrappers for functions in bioinformatics analysis pipelines


This convenient wrapper function converts raw counts to the log2-counts per million scale, following normalization for library size and a minimal count shift.

lcpm(mat, filter = NULL, method = "TMM")

`mat`	Probe by sample matrix of raw counts.
`filter`	Optional vector of length 2 specifying the filter criterion. Each probe must have at least `filter[1]` log2-counts per million in at least `filter[2]` libraries to pass the expression threshold.
`method`	Normalization method to be used. See Details.

lcpm applies the voom transformation to sequencing data, converting counts to approximately normal distributions on the log2-CPM scale. The transformed data can then be modelled using traditional linear techniques (at least once precision weights have been applied; see Law et al. (2014)), or used for unsupervised clustering analysis via PCA, MDS, or other methods.
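As a rough illustration of the log2-CPM scale, the sketch below applies the transformation described by Law et al. (2014) in base R, with offsets of 0.5 on the counts and 1 on the library sizes. Whether lcpm uses exactly these offsets internally is an assumption here; the toy matrix and the names counts, lib_size and log_cpm are made up for the example.

```r
# Toy probe-by-sample matrix of raw counts (values are invented)
counts <- matrix(c(0, 10, 100, 5, 20, 200), nrow = 3,
                 dimnames = list(paste0("gene", 1:3), c("lib1", "lib2")))

# Library size = total counts per sample; with TMM, RLE or upper-quartile
# normalization this would be the effective library size instead
lib_size <- colSums(counts)

# voom-style log2-CPM: offset counts by 0.5 and library sizes by 1
# (offsets assumed from Law et al., 2014)
log_cpm <- t(log2((t(counts) + 0.5) / (lib_size + 1) * 1e6))
log_cpm
```

The small offsets keep zero counts finite on the log scale, which is the "minimal count shift" mentioned in the description.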

It is recommended that low-count genes be filtered out prior to transformation. There is no general algorithm for determining the most appropriate expression filter for a given data set. As a rule of thumb, the limma authors advise setting filter[1] to either 1, or 10 / (L / 1,000,000), where L is the minimum library size for a given count matrix. The former corresponds to a log2-CPM of 0, while the latter may be preferable in cases where read depth is especially shallow. For filter[2], the authors recommend using the number of replicates in the largest group, to guarantee that a gene is expressed in at least one sample for any groupwise comparison. These are broad guidelines, however, not strict rules.
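The rule of thumb above can be worked through by hand. The following base R sketch computes the 10 / (L / 1,000,000) cutoff for a toy matrix and applies it on the CPM scale; it is not the code lcpm runs, and the data, the group layout (two groups of two replicates) and the names cpm_cutoff, min_libs and keep are assumptions made for the example.

```r
# Toy count matrix: 6 genes by 4 libraries (values are invented)
counts <- matrix(
  c(  0,   0,   1,   0,
      5,   8,   3,   6,
    120, 150,  90, 200,
     10,  12,  15,   9,
      0,   2,   0,   1,
    300, 250, 400, 350),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("gene", 1:6), paste0("lib", 1:4))
)

lib_size <- colSums(counts)

# Counts per million, without any normalization factors
cpm <- t(t(counts) / lib_size) * 1e6

# Cutoff equivalent to about 10 reads in the smallest library
cpm_cutoff <- 10 / (min(lib_size) / 1e6)

# Assumed design: two groups of two replicates, so the largest group has 2
min_libs <- 2

# Keep genes that reach the cutoff in at least min_libs libraries
keep <- rowSums(cpm >= cpm_cutoff) >= min_libs
rownames(counts)[keep]
```

Values chosen this way would be passed to lcpm as filter = c(cpm_cutoff, min_libs), subject to the scale that filter[1] expects.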

method = "TMM" is the weighted trimmed mean of M-values (to the reference) proposed by Robinson & Oshlack (2010), where the weights are from the delta method on binomial data.

method = "RLE" is the scaling factor method proposed by Anders & Huber (2010). We call it "relative log expression", as the median library is calculated from the geometric mean of all columns, and the median ratio of each sample to the median library is taken as the scale factor.
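To make the RLE description concrete, here is a simplified base R sketch of the median-ratio idea. It is an illustration under assumptions rather than the edgeR implementation: the toy counts and the names geo_mean, size_factors and norm_factors are invented, and the final rescaling follows the "multiply to 1" convention described further down.

```r
# Toy counts: 4 genes by 3 libraries; library 3 is sequenced about 4x deeper
counts <- matrix(
  c(10, 20,  40,
    30, 60, 120,
     5, 10,  20,
     0,  4,   8),
  nrow = 4, byrow = TRUE
)

# "Median library": per-gene geometric mean across all columns
# (genes with any zero count get a geometric mean of 0)
geo_mean <- exp(rowMeans(log(counts)))

# Median ratio of each sample to the median library, over genes with geo_mean > 0
size_factors <- apply(counts, 2, function(u) median((u / geo_mean)[geo_mean > 0]))

# Express relative to library size, then rescale so the factors multiply to 1
raw_factors <- size_factors / colSums(counts)
norm_factors <- raw_factors / exp(mean(log(raw_factors)))
norm_factors
```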

method = "upperquartile" is the upper-quartile normalization method of Bullard et al. (2010), in which the scale factors are calculated from the 75% quantile of the counts for each library, after removing genes which are zero in all libraries.
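The upper-quartile calculation can be sketched in a few lines of base R. This is a minimal illustration with made-up counts, not the edgeR code; dividing by library size and rescaling the factors to multiply to 1 are assumptions based on the conventions described in this section.

```r
# Toy counts: 5 genes by 3 libraries; the first gene is zero everywhere
counts <- matrix(
  c(  0,   0,   0,
      1,   2,   3,
      4,   4,   8,
     10,  20,  30,
    100, 200, 300),
  nrow = 5, byrow = TRUE
)

# Remove genes that are zero in all libraries
expressed <- counts[rowSums(counts) > 0, , drop = FALSE]

# 75% quantile of the remaining counts in each library
uq <- apply(expressed, 2, quantile, probs = 0.75, names = FALSE)

# Express relative to library size, then rescale so the factors multiply to 1
raw_factors <- uq / colSums(counts)
norm_factors <- raw_factors / exp(mean(log(raw_factors)))
norm_factors
```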

If method = "none", then the normalization factors are set to 1.

For symmetry, normalization factors are adjusted to multiply to 1. The effective library size is then the original library size multiplied by the scaling factor.
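The rescaling and the effective library size are simple arithmetic, sketched here in base R. The raw factors and library sizes are hypothetical numbers chosen for the example, not output from any normalization method.

```r
# Hypothetical unscaled factors and library sizes for four samples
raw_factors <- c(0.8, 1.0, 1.25, 1.1)
lib_size <- c(1.2e6, 0.9e6, 1.5e6, 1.1e6)

# Divide by the geometric mean so that the factors multiply to 1
norm_factors <- raw_factors / exp(mean(log(raw_factors)))
prod(norm_factors)

# Effective library size: original library size times the scaling factor
eff_lib_size <- lib_size * norm_factors
eff_lib_size
```

Dividing by the geometric mean leaves the ratios between samples unchanged, so only the overall scale of the factors is fixed.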

Note that rows that have zero counts for all columns are trimmed before normalization factors are computed. Therefore rows with all zero counts do not affect the estimated factors.

A numeric matrix of normalized counts on the log2-CPM scale.

Law, C.W., Chen, Y., Shi, W., & Smyth, G.K. (2014). "voom: precision weights unlock linear model analysis tools for RNA-seq read counts." Genome Biology, 15:R29. https://genomebiology.biomedcentral.com/articles/10.1186/gb-2014-15-2-r29

Anders, S. & Huber, W. (2010). "Differential expression analysis for sequence count data." Genome Biology, 11:R106. https://genomebiology.biomedcentral.com/articles/10.1186/gb-2010-11-10-r106

Bullard, J.H., Purdom, E., Hansen, K.D. & Dudoit, S. (2010). "Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments." BMC Bioinformatics, 11:94. http://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-11-94

Robinson, M.D. & Oshlack, A. (2010). "A scaling normalization method for differential expression analysis of RNA-seq data." Genome Biology, 11:R25. https://genomebiology.biomedcentral.com/articles/10.1186/gb-2010-11-3-r25

voom, cpm

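# Load the package that provides lcpm
library(biomisc)

# Simulate count data: 1000 probes by 5 samples (seeded for reproducibility)
set.seed(123)
mat <- matrix(rnbinom(5000, mu = 4, size = 1), nrow = 1000, ncol = 5)

# Plot raw counts
library(bioplotr)
plot_density(mat)

# Plot transformed counts (TMM normalization by default, no filter)
trans_mat <- lcpm(mat)
plot_density(trans_mat)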

dswatson/biomisc documentation built on May 15, 2019, 4:52 p.m.
