RUVs-methods | R Documentation |
This function implements the RUVs method of Risso et al. (2014).
RUVs(x, cIdx, k, scIdx, round=TRUE, epsilon=1, tolerance=1e-8, isLog=FALSE)
x |
Either a genes-by-samples numeric matrix or a SeqExpressionSet object containing the read counts. |
cIdx |
A character, logical, or numeric vector indicating the subset of genes to be used as negative controls in the estimation of the factors of unwanted variation. |
k |
The number of factors of unwanted variation to be estimated from the data. |
scIdx |
A numeric matrix specifying the replicate samples for which to compute the count differences used to estimate the factors of unwanted variation (see details). |
round |
If TRUE, the normalized expression measures are rounded. |
epsilon |
A small constant (usually no larger than one) to be added to the counts prior to the log transformation to avoid problems with log(0). |
tolerance |
Tolerance in the selection of the number of positive singular values, i.e., a singular value must be larger than tolerance to be considered positive. |
isLog |
Set to TRUE if the input data are already log-transformed. |
The RUVs procedure performs factor analysis on a matrix of count differences for replicate/negative control samples, for which the biological covariates of interest are constant.
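The idea in the paragraph above can be illustrated with a minimal base R sketch. It is an illustration under stated assumptions, not the package's implementation: the data are simulated, every gene is treated as a negative control, and the variable names (diffs, alpha, normalized) are hypothetical. Differences are taken within each replicate set by subtracting the set mean on the log scale, the factors come from a singular value decomposition of those differences, and W is obtained by regressing the log counts on the estimated factor loadings.

```r
set.seed(1)

# Simulated counts (assumption): 50 genes x 6 samples.
# Samples 1-3 and samples 4-6 form two sets of replicates.
x <- matrix(rpois(50 * 6, lambda = 100), nrow = 50,
            dimnames = list(paste0("g", 1:50), paste0("s", 1:6)))
scIdx <- matrix(c(1:3, 4:6), nrow = 2, byrow = TRUE)
k <- 1        # number of factors of unwanted variation
epsilon <- 1  # constant added before taking logs

# Samples-by-genes matrix of log counts.
Y <- t(log(x + epsilon))

# Within each replicate set, subtract the set mean from every member.
# Biology is constant within a set, so what remains is unwanted variation.
diffs <- do.call(rbind, lapply(seq_len(nrow(scIdx)), function(i) {
  members <- scIdx[i, scIdx[i, ] > 0]            # drop the -1 padding
  Yset <- Y[members, , drop = FALSE]
  sweep(Yset, 2, colMeans(Yset))                 # default FUN is "-"
}))

# Factor analysis of the differences via SVD: alpha is k x genes.
sv <- svd(diffs, nu = 0, nv = k)
alpha <- diag(sv$d[1:k], nrow = k) %*% t(sv$v[, 1:k, drop = FALSE])

# Regress the log counts on alpha to estimate W (samples x k).
W <- Y %*% t(alpha) %*% solve(alpha %*% t(alpha))

# Remove the estimated unwanted variation and return to the count scale.
normalized <- t(exp(Y - W %*% alpha) - epsilon)  # genes x samples
```

In this sketch W plays the role of the samples-by-factors matrix and normalized the role of the genes-by-samples normalized measures; rounding and the tolerance check on the singular values are left out for brevity.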
Each row of scIdx should correspond to a set of replicate samples. The number of columns is the size of the largest set of replicates; rows for smaller sets are padded with -1 values. For example, if the sets of replicate samples are (1,11,21), (2,3), (4,5), (6,7,8), then scIdx should be:
1 11 21
2 3 -1
4 5 -1
6 7 8
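A matrix in this layout can be built from a list of replicate sets with a few lines of base R. The helper name buildScIdx below is hypothetical and is not part of the package; it simply pads each set with -1 up to the width of the largest set.

```r
# Hypothetical helper (not part of RUVSeq): turn a list of replicate sets
# into a matrix with one row per set, padded on the right with -1.
buildScIdx <- function(sets) {
  width <- max(lengths(sets))
  rows <- lapply(sets, function(s) c(s, rep(-1, width - length(s))))
  do.call(rbind, rows)
}

# The replicate sets from the example above.
sets <- list(c(1, 11, 21), c(2, 3), c(4, 5), c(6, 7, 8))
scIdx <- buildScIdx(sets)
scIdx
```

The resulting numeric matrix has four rows and three columns and can be passed as the scIdx argument.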
signature(x = "matrix", cIdx = "ANY", k = "numeric", scIdx = "matrix")
It returns a list with:
A samples-by-factors matrix with the estimated factors of unwanted variation (W).
The genes-by-samples matrix of normalized expression measures (possibly rounded) obtained by removing the factors of unwanted variation from the original read counts (normalizedCounts).
signature(x = "SeqExpressionSet", cIdx = "character", k="numeric", scIdx = "matrix")
It returns a SeqExpressionSet with:
The normalized counts in the normalizedCounts slot.
The estimated factors of unwanted variation as additional columns of the phenoData slot.
Davide Risso (building on a previous version by Laurent Jacob).
D. Risso, J. Ngai, T. P. Speed, and S. Dudoit. Normalization of RNA-seq data using factor analysis of control genes or samples. Nature Biotechnology, 2014. (In press).
D. Risso, J. Ngai, T. P. Speed, and S. Dudoit. The role of spike-in standards in the normalization of RNA-Seq. In D. Nettleton and S. Datta, editors, Statistical Analysis of Next Generation Sequence Data. Springer, 2014. (In press).
RUVg, RUVr.
library(RUVSeq)
library(zebrafishRNASeq)
data(zfGenes)

## run on a subset of genes for time reasons
## (real analyses should be performed on all genes)
genes <- rownames(zfGenes)[grep("^ENS", rownames(zfGenes))]
spikes <- rownames(zfGenes)[grep("^ERCC", rownames(zfGenes))]
set.seed(123)
idx <- c(sample(genes, 1000), spikes)
seq <- newSeqExpressionSet(as.matrix(zfGenes[idx,]))

# RUVs normalization
## all genes are used as negative controls
controls <- rownames(seq)
## samples 1-3 and 4-6 are the two sets of replicates
differences <- matrix(data=c(1:3, 4:6), byrow=TRUE, nrow=2)
seqRUVs <- RUVs(seq, controls, k=1, differences)

pData(seqRUVs)
head(normCounts(seqRUVs))