voom: Transform RNA-Seq Data Ready for Linear Modelling
In richierocks/limma2: Linear Models for Microarray Data


Transform count data to log2-counts per million (logCPM), estimate the mean-variance relationship and use this to compute appropriate observational-level weights. The data are then ready for linear modelling.

voom(counts, design = NULL, lib.size = NULL, normalize.method = "none", plot = FALSE, span = 0.5, ...)

`counts`	a numeric `matrix` containing raw counts, or an `ExpressionSet` containing raw counts, or a `DGEList` object.
`design`	design matrix with rows corresponding to samples and columns to coefficients to be estimated. Defaults to the unit vector, meaning that samples are treated as replicates.
`lib.size`	numeric vector containing total library sizes for each sample. If `NULL` and `counts` is a `DGEList`, then the normalized library sizes are taken from `counts`. Otherwise library sizes are calculated from the columnwise count totals.
`normalize.method`	normalization method to be applied to the logCPM values. Choices are as for the `method` argument of `normalizeBetweenArrays` when the data is single-channel.
`plot`	`logical`, should a plot of the mean-variance trend be displayed?
`span`	width of the lowess smoothing window as a proportion.
`...`	other arguments are passed to `lmFit`.
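
As an illustration of how these arguments fit together, the following is a minimal sketch of a typical call. It assumes the Bioconductor `limma` package is installed (the calls are skipped otherwise); the simulated counts, the group labels and all object names are hypothetical and chosen only for this example.

```r
# Simulated data: 200 genes x 6 samples, two groups of three (hypothetical).
set.seed(1)
counts <- matrix(rnbinom(200 * 6, mu = 50, size = 5), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("s", 1:6)))
group  <- factor(rep(c("A", "B"), each = 3))
design <- model.matrix(~ group)   # intercept + group effect

# Assumption: the Bioconductor package 'limma' is available.
if (requireNamespace("limma", quietly = TRUE)) {
  v   <- limma::voom(counts, design, plot = FALSE)  # EList with E and weights
  fit <- limma::lmFit(v, design)                    # weights in v are used in the fit
  fit <- limma::eBayes(fit)
  top <- limma::topTable(fit, coef = 2, number = 5) # top genes for the group effect
}
```

Passing the same `design` to both `voom` and `lmFit` keeps the weights consistent with the model that is eventually fitted.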

This function is intended to process RNA-Seq or ChIP-Seq data prior to linear modelling in limma.

voom is an acronym for mean-variance modelling at the observational level. The key concern is to estimate the mean-variance relationship in the data, then use this to compute appropriate weights for each observation. Count data almost always show non-trivial mean-variance relationships. Raw counts show increasing variance with increasing count size, while log-counts typically show a decreasing mean-variance trend. This function estimates the mean-variance trend for log-counts, then assigns a weight to each observation based on its predicted variance. The weights are then used in the linear modelling process to adjust for heteroscedasticity.
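
The opposite directions of the two trends can be seen in a small simulation. This is only a sketch in base R: the negative binomial model, its dispersion (`size = 10`), the range of expected counts and the cut-offs for "low" and "high" genes are all assumptions made for illustration.

```r
set.seed(42)
n_genes   <- 2000
n_samples <- 6
mu <- 2^runif(n_genes, 1, 12)   # hypothetical expected count per gene

# Row i of 'counts' holds n_samples draws with expected value mu[i].
counts <- matrix(rnbinom(n_genes * n_samples, mu = rep(mu, n_samples), size = 10),
                 nrow = n_genes)

raw_var <- apply(counts, 1, var)               # variance of raw counts
log_var <- apply(log2(counts + 0.5), 1, var)   # variance of log2 counts (offset 0.5)

low  <- mu < 2^4
high <- mu > 2^9

# Raw counts: variance grows with the mean.
c(low = median(raw_var[low]), high = median(raw_var[high]))
# Log-counts: variance shrinks with the mean.
c(low = median(log_var[low]), high = median(log_var[high]))
```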

In an experiment, a count value is observed for each tag in each sample. A tag-wise mean-variance trend is computed using `lowess`. The tag-wise mean is the mean log2 count, with an offset of 0.5, across samples for a given tag. The tag-wise variance is the quarter-root variance of the normalized log2 counts per million values, with an offset of 0.5, across samples for a given tag. Tags with zero counts across all samples are not included in the lowess fit. Optional normalization is performed using `normalizeBetweenArrays`. Using fitted values of log2 counts from a linear model fit by `lmFit`, the variance for each observation is interpolated from the mean-variance trend using `approxfun`. The inverse variances are then used as weights to correct for the mean-variance trend in the count data.
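
The steps described above can be sketched in base R as follows. This is a simplified illustration, not the package's source code: `voom_sketch` is a hypothetical name, the `+ 1` added to the library sizes is an assumption of this sketch, normalization is omitted, and the design matrix is assumed to have full column rank with at least one residual degree of freedom.

```r
voom_sketch <- function(counts, design = NULL, span = 0.5) {
  counts <- as.matrix(counts)
  if (is.null(design)) design <- matrix(1, ncol(counts), 1)  # samples as replicates
  lib.size <- colSums(counts)

  # log2 counts per million with an offset of 0.5 on the counts
  # (the +1 on the library size is an assumption of this sketch).
  y <- t(log2(t(counts + 0.5) / (lib.size + 1) * 1e6))

  # Tag-wise least-squares fit of logCPM on the design matrix.
  qrd    <- qr(design)
  coefs  <- t(qr.coef(qrd, t(y)))        # tags x coefficients
  fitted <- coefs %*% t(design)          # tags x samples, logCPM scale
  df     <- ncol(counts) - qrd$rank
  sigma  <- sqrt(rowSums((y - fitted)^2) / df)

  # Trend of quarter-root variance (sqrt of residual sd) on mean log2 count,
  # leaving out tags with zero counts in every sample.
  sx   <- rowMeans(y) + mean(log2(lib.size + 1)) - log2(1e6)
  sy   <- sqrt(sigma)
  keep <- rowSums(counts) > 0
  trend <- lowess(sx[keep], sy[keep], f = span)
  f <- approxfun(trend, rule = 2, ties = mean)

  # Convert fitted logCPM to fitted log2 counts, then interpolate the trend.
  fitted.logcount <- t(t(fitted) + log2(lib.size + 1)) - log2(1e6)
  w <- 1 / f(fitted.logcount)^4          # inverse of the predicted variance
  dim(w) <- dim(fitted.logcount)

  list(E = y, weights = w, design = design, lib.size = lib.size)
}

# Usage on simulated counts (hypothetical data).
set.seed(7)
sim <- matrix(rnbinom(300 * 6, mu = rep(2^runif(300, 2, 10), 6), size = 8),
              nrow = 300)
des <- cbind(1, rep(0:1, each = 3))
vs  <- voom_sketch(sim, des)
dim(vs$weights)
```

Because the trend is fitted on the quarter-root scale, raising the interpolated value to the fourth power recovers a predicted variance, and its reciprocal is the weight.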

An EList object with the following components:

`E`	numeric matrix of normalized expression values on the log2 scale
`weights`	numeric matrix of inverse variance weights
`design`	design matrix
`lib.size`	numeric vector of total normalized library sizes
`genes`	data frame of gene annotation extracted from `counts`

Charity Law and Gordon Smyth

Law, CW (2013). Precision weights for gene expression analysis. PhD Thesis. University of Melbourne, Australia. http://repository.unimelb.edu.au/10187/17598

Law, CW, Chen, Y, Shi, W, Smyth, GK (2014). Voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biology (Accepted 9 January 2014). http://www.statsci.org/smyth/pubs/VoomPreprint.pdf

A voom case study is given in the User's Guide.

vooma is a similar function but for microarrays instead of RNA-seq.

richierocks/limma2 documentation built on May 27, 2019, 8:47 a.m.