View source: R/empiricalBayesLM.R
empiricalBayesLM  R Documentation 
This function removes variation in high-dimensional data due to unwanted covariates while preserving variation due to retained covariates. To prevent numerical instability, it uses Empirical Bayes-moderated linear regression, optionally in a robust (outlier-resistant) form.
empiricalBayesLM(
  data,
  removedCovariates,
  retainedCovariates = NULL,
  initialFitFunction = NULL,
  initialFitOptions = NULL,
  initialFitRequiresFormula = NULL,
  initialFit.returnWeightName = NULL,
  fitToSamples = NULL,
  weights = NULL,
  automaticWeights = c("none", "bicov"),
  aw.maxPOutliers = 0.1,
  weightType = c("apriori", "empirical"),
  stopOnSmallWeights = TRUE,
  minDesignDeviation = 1e-10,
  robustPriors = FALSE,
  tol = 1e-4,
  maxIterations = 1000,
  garbageCollectInterval = 50000,
  scaleMeanToSamples = fitToSamples,
  scaleMeanOfSamples = NULL,
  getOLSAdjustedData = TRUE,
  getResiduals = TRUE,
  getFittedValues = TRUE,
  getWeights = TRUE,
  getEBadjustedData = TRUE,
  verbose = 0,
  indent = 0)
data 
A 2-dimensional matrix or data frame of numeric data to be adjusted. Variables (for example, genes or methylation profiles) should be in columns and observations (samples) should be in rows. 
removedCovariates 
A vector or two-dimensional object (matrix or data frame) giving the covariates whose effect on the data is to be removed. At least one such covariate must be given. 
retainedCovariates 
A vector or two-dimensional object (matrix or data frame) giving the covariates whose effect on the data is
to be retained. May be 
initialFitFunction 
Function name to perform the initial fit. The default is to use the internal implementation of linear model
fitting. The function must take arguments 
initialFitOptions 
Optional specifications of extra arguments for 
initialFitRequiresFormula 
Logical: does the initial fit function need 
initialFit.returnWeightName 
Name of the component of the return value of 
fitToSamples 
Optional index of samples from which the linear model fits should be calculated. Defaults to all samples. If given, the models will be only fit to the specified samples but all samples will be transformed using the calculated coefficients. 
weights 
Optional 2-dimensional matrix or data frame of the same dimensions as 
automaticWeights 
One of (unique abbreviations of) 
aw.maxPOutliers 
If 
weightType 
One of (unique abbreviations of) 
stopOnSmallWeights 
Logical: should presence of small 
minDesignDeviation 
Minimum standard deviation for columns of the design matrix to be retained. Columns with standard deviations below this number will be removed (effectively removing the corresponding terms from the design). 
robustPriors 
Logical: should robust priors be used? This essentially means replacing mean by median and covariance by biweight midcovariance. 
tol 
Convergence criterion used in the numerical equation solver. When the relative change in coefficients falls below this threshold, the system will be considered to have converged. 
maxIterations 
Maximum number of iterations to use. 
garbageCollectInterval 
Number of variables after which to call garbage collection. 
scaleMeanToSamples 
Optional specification of samples (given as a vector of indices) to whose means the resulting adjusted data should be scaled (more precisely, shifted). 
scaleMeanOfSamples 
Optional specification of samples (given as a vector of indices) that will be used in calculating the shift. Specifically,
the shift is such that the mean of samples given in 
getOLSAdjustedData 
Logical: should data adjusted by ordinary least squares or by

getResiduals 
Logical: should the residuals (adjusted values without the means) be returned? 
getFittedValues 
Logical: should fitted values be returned? 
getWeights 
Logical: should the final weights be returned? 
getEBadjustedData 
Logical: should the EB step be performed and the adjusted data returned? If this
is 
verbose 
Level of verbosity. Zero means silent, higher values result in more diagnostic messages being printed. 
indent 
Indentation of diagnostic messages. Each unit adds two spaces. 
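The fitToSamples behaviour described in the argument list (fit the model on a subset of samples, then transform all samples with the resulting coefficients) can be illustrated with a minimal base R sketch. This is not the package's implementation: it uses plain lm on a single simulated column, and the names d, fitIdx, and betaBatch are hypothetical.

```r
set.seed(1)
n <- 40
batch <- rep(c(0, 1), times = n / 2)            # covariate to be removed (simulated)
d <- data.frame(y = 2 + 1.5 * batch + rnorm(n, sd = 0.3),
                batch = batch)

fitIdx <- 1:20                                  # plays the role of fitToSamples

# Fit the linear model on the selected samples only.
fit <- lm(y ~ batch, data = d[fitIdx, ])
betaBatch <- coef(fit)[["batch"]]

# Apply the fitted coefficient to ALL samples, including those not used in the fit.
adjusted <- d$y - betaBatch * d$batch
```

Samples with batch equal to 0 are left unchanged, while the remaining samples are shifted by the effect estimated from the first 20 samples only.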
This function uses Empirical Bayes-moderated (EB) linear regression to remove variation in data due to the variables in removedCovariates while retaining variation due to variables in retainedCovariates, if any are given. The EB step uses simple normal priors on the regression coefficients and inverse gamma priors on the variances. The procedure starts with multivariate ordinary linear regression of individual columns in data on retainedCovariates and removedCovariates. Alternatively, the user may specify an initial fit function (e.g., robust linear regression). To make the coefficients comparable, columns of data are scaled to mean 0 and variance 1 (weighted if weights are given). The resulting regression coefficients are used to determine the parameters of the normal prior (mean and covariance, or median and biweight midcovariance if robust priors are used), and the variances are used to determine the parameters of the inverse gamma prior. The EB step then essentially shrinks the coefficients toward their means, with the amount of shrinkage determined by the prior covariance.
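The shrinkage idea can be sketched in base R under strong simplifying assumptions. The sketch below is not the package's algorithm: it plugs in the OLS residual variances instead of using an inverse gamma prior, does not iterate, uses no weights, centers the covariates so that no intercept term is needed, and stays on the scaled data. All object names (Ys, bOLS, bEB, and so on) are hypothetical.

```r
set.seed(42)
n <- 30    # samples
p <- 200   # variables (columns of data)

retained <- rnorm(n)
removed  <- rnorm(n)
Y <- outer(retained, rnorm(p, 0.5, 0.3)) +
     outer(removed,  rnorm(p, 1.0, 0.3)) +
     matrix(rnorm(n * p), n, p)

# Scale columns of the data to mean 0, variance 1; center the covariates.
Ys <- scale(Y)
X  <- scale(cbind(retained = retained, removed = removed), scale = FALSE)

# Ordinary least squares for every column at once.
XtX  <- crossprod(X)
bOLS <- solve(XtX, crossprod(X, Ys))
rownames(bOLS) <- colnames(X)
s2 <- colSums((Ys - X %*% bOLS)^2) / (n - ncol(X) - 1)

# Normal prior estimated from the OLS coefficients across columns.
mu0   <- rowMeans(bOLS)
S0inv <- solve(cov(t(bOLS)))

# Posterior mean under the normal prior, with the variance treated as known.
bEB <- sapply(seq_len(p), function(j) {
  A   <- XtX / s2[j] + S0inv
  rhs <- (XtX %*% bOLS[, j]) / s2[j] + S0inv %*% mu0
  as.vector(solve(A, rhs))
})
rownames(bEB) <- colnames(X)

# Remove only the contribution of the removed covariate (scaled data).
adjusted <- Ys - outer(X[, "removed"], bEB["removed", ])
```

Each column's EB coefficient vector lies between its OLS estimate and the prior mean; columns with noisier fits (larger s2) are pulled more strongly toward the prior mean.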
Using appropriate weights can make the data adjustment robust to outliers. This can be achieved automatically by using the argument automaticWeights = "bicov". When bicov weights are used, we also recommend setting the argument aw.maxPOutliers to a maximum proportion of samples that could be outliers. This is especially important if some of the design variables are binary and can be expected to have a strong effect on some of the columns in data, since standard biweight midcorrelation (and its weights) does not work well on bimodal data.
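To give a feel for what biweight-type weights do, here is a minimal base R sketch of the standard biweight weight formula (deviation from the median, scaled by 9 times the raw median absolute deviation). It is an illustration only: the helper name biweightSketch is hypothetical, the adjustment controlled by aw.maxPOutliers is deliberately left out, and the weights produced by the package may differ in detail.

```r
# Standard biweight weights: w = (1 - u^2)^2 for |u| < 1, and 0 otherwise,
# where u = (x - median(x)) / (9 * MAD) and MAD is the raw (unscaled) value.
biweightSketch <- function(x) {
  u <- (x - median(x)) / (9 * mad(x, constant = 1))
  ifelse(abs(u) < 1, (1 - u^2)^2, 0)
}

x <- c(seq(-1, 1, by = 0.1), 15)   # 21 regular values plus one clear outlier
w <- biweightSketch(x)

round(range(w[1:21]), 3)           # weights of the regular values
w[22]                              # weight of the outlier
```

The regular values keep weights close to 1, while the outlying value receives weight 0 and therefore would not influence a weighted fit.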
The automatic bicov weights are determined from data only. It is implicitly assumed that there are no outliers in the retained and removed covariates. Outliers in the covariates are more difficult to work with since, even if the regression is made robust to them, they can influence the adjusted values for the sample in which they appear. Unless the covariate outliers can be attributed to a relevant variation in experimental conditions, samples with covariate outliers are best removed entirely before calling this function.
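One possible way to screen for covariate outliers before calling the function is a robust z-score based on the median and the median absolute deviation. The sketch below is only an illustration: the cutoff of 5 is an arbitrary assumption, and the names age, dataMat, and keep are hypothetical.

```r
age <- c(30:50, 120)                       # simulated covariate with one implausible value
dataMat <- matrix(seq_len(22 * 3), nrow = 22,
                  dimnames = list(NULL, c("v1", "v2", "v3")))

# Robust z-score: distance from the median in units of the (scaled) MAD.
robustZ <- abs(age - median(age)) / mad(age)

keep <- robustZ < 5                        # assumed cutoff, adjust to the application

# Drop the flagged samples from both the data and the covariate.
dataClean <- dataMat[keep, , drop = FALSE]
ageClean  <- age[keep]
```

Whether a flagged sample is a genuine outlier or reflects real experimental variation is a judgment that the cutoff cannot make; the flagged samples should be inspected before removal.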
A list with the following components (some of which may be missing depending on input options):
adjustedData 
A matrix of the same dimensions as the input 
residuals 
A matrix of the same dimensions as the input 
coefficients 
A matrix of regression coefficients. Rows correspond to the design matrix variables
(mean, retained and removed covariates) and columns correspond to the variables (columns) in 
coefficients.scaled 
A matrix of regression coefficients corresponding to columns in 
sigmaSq 
Estimated error variances (one for each column of input 
sigmaSq.scaled 
Estimated error variances corresponding to columns in 
fittedValues 
Fitted values calculated from the means and coefficients corresponding to the removed covariates, i.e., roughly the values that are subtracted out of the data. 
adjustedData.OLS 
A matrix of the same dimensions as the input 
residuals.OLS 
A matrix of the same dimensions as the input 
coefficients.OLS 
A matrix of ordinary least squares regression coefficients.
Rows correspond to the design matrix variables
(mean, retained and removed covariates) and columns correspond to the variables (columns) in 
coefficients.OLS.scaled 
A matrix of ordinary least squares regression coefficients corresponding to columns
in 
sigmaSq.OLS 
Estimated OLS error variances (one for each column of input 
sigmaSq.OLS.scaled 
Estimated OLS error variances corresponding to columns in 
fittedValues.OLS 
OLS fitted values calculated from the means and coefficients corresponding to the removed covariates. 
weights 
A matrix of weights used in the regression models. The matrix has the same dimension as the
input 
dataColumnValid 
Logical vector with one element per column of input 
dataColumnWithZeroVariance 
Logical vector with one element per column of input 
coefficientValid 
Logical matrix of the dimension (number of covariates + 1) times (number of
variables in 
Peter Langfelder
bicovWeights for suitable weights that make the adjustment robust to outliers.
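A hedged usage sketch on simulated data follows. It assumes the WGCNA package (which provides this function) is installed, and skips the call otherwise. The simulated objects batch, age, and dataSim are hypothetical; only arguments and return components listed on this page are used.

```r
set.seed(10)
n <- 50    # samples (rows)
p <- 100   # variables (columns)

batch <- rep(c(0, 1), length.out = n)          # unwanted covariate
age   <- rnorm(n, mean = 50, sd = 10)          # covariate to retain

dataSim <- matrix(rnorm(n * p), n, p) +
  outer(batch, rnorm(p, 1, 0.2)) +
  outer(age - 50, rnorm(p, 0.05, 0.01))
colnames(dataSim) <- paste0("var", seq_len(p))

if (requireNamespace("WGCNA", quietly = TRUE)) {
  eb <- WGCNA::empiricalBayesLM(
    dataSim,
    removedCovariates  = data.frame(batch = batch),
    retainedCovariates = data.frame(age = age))
  adjusted <- eb$adjustedData                  # EB-adjusted data
  print(dim(adjusted))
}
```

Adding automaticWeights = "bicov" to the call would request the robust variant described above.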