Remove unwanted variation when testing for differential methylation

Description

Provides an interface similar to lmFit from limma to the RUV2, RUV4, RUVinv and RUVrinv functions from the ruv package, which facilitates the removal of unwanted variation in a differential methylation analysis. A set of negative control variables, as described in the references, must be specified.

Usage

1
2
RUVfit(data, design, coef, ctl, method=c("inv", "rinv", "ruv4", "ruv2"), 
k = NULL, ...)

Arguments

data

numeric matrix with rows corresponding to the features of interest such as CpG sites and columns corresponding to samples or arrays.

design

the design matrix of the experiment, with rows corresponding to arrays/samples and columns to coefficients to be estimated.

coef

integer, column of the design matrix containing the comparison to test for differential methylation. Default is the last colum of the design matrix.

ctl

logical vector, length == nrow(data). Features that are to be used as negative control variables are indicated as TRUE, all other features are FALSE.

method

character string, indicates which RUV method should be used. Default method is RUVinv.

k

integer, required if method is "ruv2" or "ruv4". Indicates the number of unwanted factors to use. Can be 0.

...

additional arguments that can be passed to RUV2, RUV4, RUVinv and RUVrinv. See linked function documentation for details.

Details

This function depends on the ruv and limma packages and is used to estimate and adjust for unwanted variation in a differential methylation analysis. Briefly, the unwanted factors W are estimated using negative control variables. Y is then regressed on the variables X, Z, and W. For methylation data, the analysis is performed on the M-values, defined as the log base 2 ratio of the methylated signal to the unmethylated signal.

Value

An object of class MArrayLM (see MArrayLM-class) containing:

coefficients

The estimated coefficients of the factor(s) of interest.

sigma2

Estimates of the features' variances.

t

t statistics for the factor(s) of interest.

p

P-values for the factor(s) of interest.

multiplier

The constant by which sigma2 must be multiplied in order to get an estimate of the variance of coefficients

df

The number of residual degrees of freedom.

W

The estimated unwanted factors.

alpha

The estimated coefficients of W.

byx

The coefficients in a regression of Y on X (after both Y and X have been "adjusted" for Z). Useful for projection plots.

bwx

The coefficients in a regression of W on X (after X has been "adjusted" for Z). Useful for projection plots.

X

X. Included for reference.

k

k. Included for reference.

ctl

ctl. Included for reference.

Z

Z. Included for reference.

fullW0

Can be used to speed up future calls of RUVfit.

Author(s)

Jovana Maksimovic jovana.maksimovic@mcri.edu.au

References

Gagnon-Bartsch JA, Speed TP. (2012). Using control genes to correct for unwanted variation in microarray data. Biostatistics. 13(3), 539-52. Available at: http://biostatistics.oxfordjournals.org/content/13/3/539.full.

Gagnon-Bartsch, Jacob, and Speed. 2013. Removing Unwanted Variation from High Dimensional Data with Negative Controls. Available at: http://statistics.berkeley.edu/tech-reports/820.

See Also

RUV2, RUV4, RUVinv, RUVrinv, topRUV

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
if(require(minfi) & require(minfiData) & require(limma)) {

# Get methylation data for a 2 group comparison
meth <- getMeth(MsetEx)
unmeth <- getUnmeth(MsetEx)
Mval <- log2((meth + 100)/(unmeth + 100))

group<-factor(pData(MsetEx)$Sample_Group)
design<-model.matrix(~group)

# Perform initial analysis to empirically identify negative control features 
# when not known a priori
lFit = lmFit(Mval,design)
lFit2 = eBayes(lFit)
lTop = topTable(lFit2,coef=2,num=Inf)

# The negative control features should *not* be associated with factor of interest
# but *should* be affected by unwanted variation 
ctl = rownames(Mval) %in% rownames(lTop[lTop$adj.P.Val > 0.5,])

# Perform RUV adjustment and fit
fit = RUVfit(data=Mval, design=design, coef=2, ctl=ctl)
fit2 = RUVadj(fit)

# Look at table of top results
top = topRUV(fit2)
}