batchPred is an R package that helps identify the most relevant set of covariates to correct for when many variables could potentially confound your data, e.g. known variables such as age, sex, and read quality, or unknown variables such as principal components. The package requires an expression dataset, at least one data frame of covariates, and a reference gene-gene association network with confidence scores. In our examples, we use the 'full network' gold standards from the GIANT database.
You can install the most recent development version from GitHub using devtools:
# install.packages("devtools")
devtools::install_github("NabilaRahman/batchPred")
Step 1: Prepare your reference network using the refNet function.
Step 2: Choose a similar number of gene interactions with the highest confidence as True Positives and with the lowest confidence as True Negatives.
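The selection in Step 2 can be sketched in base R. This is only an illustration: the data frame layout, the column names (`gene1`, `gene2`, `confidence`), the toy values, and `n_pairs` are assumptions made for the example and do not describe the actual output of refNet.

```r
# Toy reference network; column names and values are assumptions for illustration.
ref_net <- data.frame(
  gene1 = c("A", "A", "B", "C", "D", "E"),
  gene2 = c("B", "C", "C", "D", "E", "F"),
  confidence = c(0.95, 0.10, 0.80, 0.05, 0.60, 0.30),
  stringsAsFactors = FALSE
)

n_pairs <- 2  # hypothetical number of pairs to take from each end

# Sort gene pairs by confidence, highest first.
ranked <- ref_net[order(ref_net$confidence, decreasing = TRUE), ]

# Highest-confidence pairs serve as true positives, lowest as true negatives.
true_pos <- head(ranked, n_pairs)
true_neg <- tail(ranked, n_pairs)

true_pos$label <- 1L
true_neg$label <- 0L

# One balanced reference subset with equal numbers of positives and negatives.
ref_subset <- rbind(true_pos, true_neg)
print(ref_subset)
```

Taking the same `n_pairs` from both ends keeps the two classes balanced, which makes AUC values from different reference subsets easier to compare.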
Step 3: Compute the most relevant set of covariates to correct for, based on these subset(s), using the batchPred function. It is recommended to run batchPred with at least 3 distinct reference subsets and to add a covariate to your model only if it shows at least a 0.1% improvement in AUC score in each run.
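The recommended decision rule can be written down as a short check. The sketch below does not call batchPred. It assumes you have already collected the AUC gain of each covariate from each run into a matrix. The covariate names, the gain values, and the reading of "0.1%" as an absolute AUC increase of 0.001 are all assumptions made for illustration.

```r
# Toy AUC gains per covariate (rows) and reference subset run (columns).
# Values are invented for illustration; they are not batchPred output.
auc_gain <- matrix(
  c(0.0040, 0.0032, 0.0051,
    0.0015, 0.0004, 0.0020,
    0.0002, 0.0001, 0.0003),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("age", "sex", "PC1"), c("run1", "run2", "run3"))
)

# Assumption: "0.1% improvement" is read as an absolute AUC gain of 0.001.
min_gain <- 0.001

# Keep a covariate only if it clears the threshold in every run.
keep <- apply(auc_gain >= min_gain, 1, all)
selected <- names(keep)[keep]
print(selected)
```

Requiring the gain in every run, rather than on average, guards against a covariate that only helps for one particular choice of reference subset.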
Step 4: Compute the variables for the ROC curve used to visualise the batch effect evaluation with the computeBceF2 function, then pipe the results into plotBceF2 to get your plot.
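To show what an ROC curve over labelled gene pairs involves, here is a minimal base-R sketch. It does not reproduce computeBceF2 or plotBceF2. It assumes each gene pair has a score (for example an absolute correlation in the corrected expression data) and a label from the reference subset, and that no two scores are tied. The scores and labels are toy values.

```r
# Toy scores per gene pair and labels (1 = true positive, 0 = true negative).
# Values are invented for illustration.
score <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.3)
label <- c(1, 1, 0, 1, 0, 0)

# Rank pairs from highest to lowest score.
ord <- order(score, decreasing = TRUE)
lab <- label[ord]

# Cumulative true and false positive rates as the threshold is lowered.
tpr <- c(0, cumsum(lab) / sum(lab))
fpr <- c(0, cumsum(1 - lab) / sum(1 - lab))

# Area under the curve by the trapezoidal rule.
auc <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)
print(auc)

# Draw the curve only in an interactive session.
if (interactive()) {
  plot(fpr, tpr, type = "l",
       xlab = "False positive rate", ylab = "True positive rate",
       main = "ROC curve (toy data)")
  abline(0, 1, lty = 2)  # chance line
}
```

A higher AUC means the scores separate true positive pairs from true negative pairs more cleanly, which is the quantity compared across candidate covariate sets.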
Somekh, J., Shen-Orr, S.S. & Kohane, I.S. Batch correction evaluation framework using a-priori gene-gene associations: applied to the GTEx dataset. BMC Bioinformatics 20, 268 (2019). https://doi.org/10.1186/s12859-019-2855-9