
Objective Prediction of Confounders

batchPred is an R package that helps identify the most relevant set of covariates to correct for when many variables could potentially confound your data. These may be known variables, such as age, sex, or read quality, or unknown variables, such as principal components. The package requires an expression dataset, at least one data frame of covariates, and a reference gene-gene association network with a confidence score. In our examples, we use 'full network' gold standards from the GIANT database.

Installation

You can install the most recent development version from GitHub using devtools:

# install.packages("devtools")
devtools::install_github("NabilaRahman/batchPred")

Quick Start

Step 1: Prepare your reference network using the refNet function.

Step 2: Choose a similar number of gene interactions with the highest confidence as true positives and with the lowest confidence as true negatives.
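The selection in Step 2 can be sketched in base R as follows. This is a minimal illustration, not the package's own code. The data frame `net` and its column names (`gene1`, `gene2`, `confidence`) are assumptions made for the example, and the network is simulated rather than taken from GIANT.

```r
# Hypothetical reference network: one row per gene pair with a confidence score.
# The column names are illustrative and are not taken from batchPred.
set.seed(1)
net <- data.frame(
  gene1 = paste0("G", sample(1:50, 200, replace = TRUE)),
  gene2 = paste0("G", sample(51:100, 200, replace = TRUE)),
  confidence = runif(200)
)

n_pairs <- 20  # use the same number of positives and negatives

# Rank gene pairs from highest to lowest confidence
ranked <- net[order(net$confidence, decreasing = TRUE), ]

true_pos <- head(ranked, n_pairs)  # highest-confidence pairs
true_neg <- tail(ranked, n_pairs)  # lowest-confidence pairs

# Combine into one labelled reference subset (1 = true positive, 0 = true negative)
reference <- rbind(
  cbind(true_pos, label = 1L),
  cbind(true_neg, label = 0L)
)
```

Repeating this with different values of `n_pairs`, or with different slices of the ranking, is one way to obtain several distinct reference subsets.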

Step 3: Use the batchPred function to compute the most relevant set of covariates to correct for, given these reference subset(s). We recommend running batchPred on at least 3 distinct reference subsets and adding a covariate to your model only if it improves the AUC score by at least 0.1% in every run.
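The recommendation in Step 3 amounts to a simple decision rule, sketched below in base R. The helper `keep_covariate` and the AUC values are hypothetical and do not come from batchPred. The sketch also assumes that "0.1%" means a gain of 0.001 on the 0 to 1 AUC scale. If a relative improvement is intended, the threshold would need to be adapted.

```r
# Hypothetical helper (not part of batchPred): decide whether to keep a covariate
# given the AUC of the model without and with it, one value per reference subset.
# Assumption: a "0.1% improvement" is an absolute AUC gain of 0.001.
keep_covariate <- function(auc_base, auc_with_cov, min_gain = 0.001, min_runs = 3) {
  stopifnot(length(auc_base) == length(auc_with_cov))
  gain <- auc_with_cov - auc_base
  # Require enough runs, and the minimum gain in every one of them
  length(gain) >= min_runs && all(gain >= min_gain)
}

# Made-up AUC values from three runs on distinct reference subsets
auc_base     <- c(0.612, 0.598, 0.605)
auc_with_cov <- c(0.615, 0.6005, 0.6071)

decision <- keep_covariate(auc_base, auc_with_cov)
```

Requiring the gain in every run, rather than on average, guards against a covariate that helps on only one reference subset.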

Step 4: Use the computeBceF2 function to compute the variables for a ROC curve that visualises the batch effect evaluation, then pipe the results into plotBceF2 to get your plot.
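To show what the variables for a ROC curve look like, here is a generic base R sketch that computes ROC points and a trapezoidal AUC from per-pair scores and true positive / true negative labels. It is not the implementation of computeBceF2 or plotBceF2. The function names `roc_points` and `auc_trapezoid`, the scores, and the labels are all made up for illustration, and tied scores are not given any special handling.

```r
# Hypothetical helper: ROC points from scores (higher = more likely associated)
# and labels (1 = true positive pair, 0 = true negative pair).
roc_points <- function(score, label) {
  ord <- order(score, decreasing = TRUE)
  label <- label[ord]
  # Lower the threshold one pair at a time and accumulate the rates
  tpr <- c(0, cumsum(label == 1) / sum(label == 1))
  fpr <- c(0, cumsum(label == 0) / sum(label == 0))
  data.frame(fpr = fpr, tpr = tpr)
}

# Hypothetical helper: area under the ROC curve by the trapezoidal rule
auc_trapezoid <- function(roc) {
  sum(diff(roc$fpr) * (head(roc$tpr, -1) + tail(roc$tpr, -1)) / 2)
}

# Made-up example: six gene pairs, e.g. scored by absolute co-expression
score <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.3)
label <- c(1, 1, 0, 1, 0, 0)

roc <- roc_points(score, label)
auc <- auc_trapezoid(roc)
```

Calling `plot(roc$fpr, roc$tpr, type = "l")` on the result draws the curve. Comparing `auc` before and after correcting for a covariate gives the kind of AUC difference discussed in Step 3.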

References

Somekh, J., Shen-Orr, S.S. & Kohane, I.S. Batch correction evaluation framework using a-priori gene-gene associations: applied to the GTEx dataset. BMC Bioinformatics 20, 268 (2019). https://doi.org/10.1186/s12859-019-2855-9



NabilaRahman/batchPred documentation built on June 19, 2022, 5:35 a.m.