regressCovariates: regress out a covariate from the training data


View source: R/prepareMOFA.R

Description

Function to regress out a covariate from the training data.
If you have technical sources of variability (e.g. batch effects) that you do not want the model's factors to capture, you should regress them out before fitting MOFA. This function fits a simple linear regression model, extracts the residuals, and uses them to replace the original data in the TrainingData slot.
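The regression step described above can be illustrated with a minimal base R sketch. It is not the MOFA source code: the simulated matrix `Y` (features in rows, samples in columns), the `batch` label, and the helper `regress_out` are all hypothetical names introduced for illustration.

```r
# Illustrative sketch only: base R on simulated data, not the MOFA implementation.
set.seed(1)

n_samples  <- 20
n_features <- 3

# Hypothetical technical covariate: one batch label per sample
batch <- factor(rep(c("A", "B"), each = n_samples / 2))

# Simulated view (features in rows, samples in columns) with a batch shift
Y <- matrix(rnorm(n_features * n_samples), nrow = n_features)
Y[, batch == "B"] <- Y[, batch == "B"] + 2
Y[1, 5] <- NA  # one missing measurement

# Fit one linear model per feature and keep the residuals
regress_out <- function(Y, covariate) {
  out <- Y
  for (d in seq_len(nrow(Y))) {
    df <- data.frame(y = Y[d, ], covariate = covariate)
    # na.exclude pads the residuals with NA so they stay aligned with the samples
    fit <- lm(y ~ covariate, data = df, na.action = na.exclude)
    out[d, ] <- residuals(fit)
  }
  out
}

Y_reg <- regress_out(Y, batch)

# After regression the batch means of each feature are (numerically) zero
print(tapply(Y_reg[2, ], batch, mean))
```

The residual matrix has the same dimensions as the input, and missing values stay missing, so it can stand in for the original data.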
Why is this important? If big technical factors exist, the model will "focus" on capturing the variability driven by these factors, and smaller sources of variability could be missed.
But can we not simply add those covariates to the model? Technically yes, but we tested this functionality extensively and it did not yield good results.
The reason is that covariates are usually discrete labels that do not reflect the underlying molecular biology. For example, if you introduce age as a covariate, but the actual age is different from the "molecular age", the model will simply learn a new factor that corresponds to this "latent" molecular age, and it will drop the covariate from the model.
We recommend learning the factors in a completely unsupervised manner and subsequently relating them to the covariates via visualisation or a simple correlation analysis (see our vignettes for more details).
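A simple correlation analysis of this kind might look like the following sketch. It uses simulated data in base R rather than a trained model: the factor matrix `Z` (samples in rows, factors in columns) and the covariate `age` are hypothetical stand-ins for the factors you would extract from a fitted MOFA model and for your own sample metadata.

```r
# Illustrative sketch only: simulated factors stand in for those of a trained model.
set.seed(42)
n_samples <- 50

# Hypothetical continuous covariate
age <- runif(n_samples, min = 30, max = 80)

# Hypothetical factor matrix: Factor1 tracks age, Factor2 is pure noise
Z <- cbind(
  Factor1 = as.numeric(scale(age)) + rnorm(n_samples, sd = 0.3),
  Factor2 = rnorm(n_samples)
)

# Pearson correlation between each factor and the covariate,
# using only the samples where both values are observed
r <- cor(Z, age, use = "pairwise.complete.obs")
print(round(r, 2))
```

A factor that correlates strongly with a covariate is a candidate for being driven by it; a scatter plot of the factor against the covariate is a useful follow-up check.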

Usage

regressCovariates(object, views, covariates, min_observations = 5)

Arguments

object

an untrained MOFAmodel

views

the view(s) from which to regress out the covariates.

covariates

a vector (one covariate) or a data.frame (multiple covariates) in which each element or row corresponds to one sample, sorted in the same order as in the input data matrices. You can check the order by calling sampleNames(MOFAobject). If required, fill missing values with NA; these are ignored when fitting the linear model.
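One way to bring a covariate table into the required order is sketched below in base R. The vector `sample_order` is a hypothetical stand-in for the output of sampleNames(MOFAobject), and the `covs` table is made up for illustration.

```r
# Illustrative sketch only: hypothetical sample names and covariates.

# Stand-in for sampleNames(MOFAobject)
sample_order <- c("S1", "S2", "S3", "S4")

# Covariate table in a different order, with no entry for sample S3
covs <- data.frame(
  age = c(61, 45, 70),
  sex = factor(c("f", "m", "m")),
  row.names = c("S2", "S1", "S4")
)

# match() returns NA for samples without covariates,
# and indexing a data.frame by NA yields a row filled with NA
aligned <- covs[match(sample_order, rownames(covs)), , drop = FALSE]
rownames(aligned) <- sample_order

print(aligned)
```

The resulting data.frame has one row per sample in the model's order, with NA wherever a covariate is unavailable.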

min_observations

number of non-missing observations required

Value

Returns an untrained MOFAmodel where the specified covariates have been regressed out in the training data.

Examples

library(MOFA)
library(MultiAssayExperiment)

# Load the CLL example data and the matching sample covariates
data("CLL_data", package = "MOFAdata")
data("CLL_covariates", package = "MOFAdata")

# Bundle data and covariates into a MultiAssayExperiment
mae_CLL <- MultiAssayExperiment(
  experiments = CLL_data,
  colData = CLL_covariates
)

# Create and prepare an untrained MOFA object
MOFAobject <- createMOFAobject(mae_CLL)
MOFAobject <- prepareMOFA(MOFAobject)

# Regress Gender out of three views
MOFAobject_reg <- regressCovariates(
  object = MOFAobject,
  views = c("Drugs", "Methylation", "mRNA"),
  covariates = InputData(MOFAobject)$Gender
)

# MOFA object with training data after regressing out the specified covariate
MOFAobject_reg

bioFAM/MOFA documentation built on Oct. 3, 2020, 12:53 a.m.