knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
DNA methylation has been shown to be in important regulator of gene expression.
This relationship is complex, with multiple CpG sites in a region influencing expression rather than by a single CpG site in isolation. The EpiXprS
package provides a means for constructing these multi-CpG models, allowing for the further complexity of context dependent modifiers of the relationship between methylation and gene expression. In particular, EpiXprS
was developed to quantify and account for the context dependent effect of race through the inclusion of CpG-by-race interaction terms. Namely, to construct the original models, paired DNA methylation and gene expression were downloaded from The Cancer Genome Atlas TCGA, using the TCGAbiolinks
package, to assess the role of race as a modifier in the multivariate relationship between DNA methylation and gene expression. Eight tumor types were downloaded that had high African American representation to allow us to assess these potential differences. While the subsequent description therefore focuses on race (African American and European American) the context being evaluated, the contribution of any potential dichotomous modifier (e.g. environmental exposure) could be assessed.
To install this package, start R
and enter:
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager") BiocManager::install("EpiXprS")
This will install the package and all missing dependencies.
library(EpiXprS) library(SummarizedExperiment) set.seed(1014) data('Testdat',package = 'EpiXprS') Methy <- assays(Testdat)$Methyl Methy[1:5,1:5] Counts <- assays(Testdat)$RNASeqGene Counts[1:2,1:5] coldat <- colData(Testdat) coldat$race <- ifelse(coldat$race == 'white',1,0) head(coldat)
The main functionality of this package is to allow one to associate many CpG sites with gene expression, while accounting for possible race-specific effects. This is managed by including multiplicative interaction terms between race and CpG sites in a penalized regression model. The function Construct()
has three methods, Specific
which constructs models that ignore possible race-specific effects, Adjusted
where race is included in the model, and Interaction
where the aforementioned multiplicative interaction terms are included as well. The function returns 1) the prediction model, 2) the cross-validated predicted expression, 3) the percent of variability explained, 4) & 5) the race specific variability
explained for the Adjusted
and Interaction
model types, and 6) ensembl annotation for the genes modeled.
out <- Construct(Object = Testdat, method ='Interaction', dist = NULL, nfolds = NULL, impute = TRUE, beta = TRUE, parallel = FALSE, Methylation.Annotation = annotation_450k, Gene.Annotation = annotated_RNA) rbind(head(out$Model),tail(out$Model)) head(out$Predicted.Expression) out$Overall.D2 out$Class1.D2 out$Class2.D2 out$Annotation
Another useful function, ModelSearch()
, allows one to search through pre-constructed models to identify gene models in which a given CpG site is included. It also provides functionality to search for specific genes to identify the CpG sites that are included its model.
library(ExperimentHub) eh = ExperimentHub() x <- query(eh, 'EpiXprS') Cancer <- x[["EH6022"]] ModelSearch('cg21837192', Cancer = Cancer)
Similarly, pre-constructed models can be used to predict gene expression from a 450K DNA methylation data set, using the Predict()
function. Prior to use, we advise missing DNA methylation be imputed using the Imputation()
function. This function is borrowed from the methyLImp
package, to provide complete predicted expression. For more details, see ?Predict
.
Finally, predicted expression scores from the Predict()
function can be used to associate predicted expression with a phenotype of interest using the Assoc
function. This function currently implements linear regression, logistic regression, and Cox proportional hazards models. For more details, see ?Assoc
.
sessionInfo()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.