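Let
Y = XB + ZA + E,
where Y is a matrix of gene expression data, X is a matrix of observed covariates, B is the matrix of coefficients for the observed covariates, Z is a matrix of hidden confounders, A is the matrix of coefficients for the hidden confounders, and E is a matrix of errors.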
Not accounting for the hidden covariates, Z, can reduce power and result in poor control of the false discovery rate. This package provides a suite of functions to adjust for hidden confounders, both with and without access to control genes.
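To make the cost of ignoring Z concrete, here is a small base-R simulation from the model above. It is only a sketch: the dimensions, the strength of the correlation between Z and the group indicator, and all variable names are illustrative assumptions, not values taken from the package.

```r
set.seed(1)
n <- 40   # samples (assumed)
p <- 200  # genes (assumed)
k <- 2    # hidden confounders (assumed)

# Observed design: an intercept plus a binary group indicator.
group <- rep(c(0, 1), each = n / 2)
X <- cbind(1, group)

# Hidden confounders that are correlated with the group indicator.
Z <- matrix(rnorm(n * k), nrow = n) + 1.5 * group

B <- matrix(0, nrow = ncol(X), ncol = p)  # every gene is null for the group effect
A <- matrix(rnorm(k * p), nrow = k)
E <- matrix(rnorm(n * p), nrow = n)

Y <- X %*% B + Z %*% A + E

# Naive per-gene least squares that ignores Z.
fit <- lm.fit(x = X, y = Y)
resid_df <- n - ncol(X)
sigma2 <- colSums(fit$residuals^2) / resid_df
se <- sqrt(sigma2 * solve(crossprod(X))[2, 2])
tstat <- fit$coefficients[2, ] / se
pval <- 2 * pt(-abs(tstat), df = resid_df)

# Proportion of null genes declared significant at the 0.05 level.
false_positive_rate <- mean(pval < 0.05)
print(false_positive_rate)
```

Because every gene is null here, a well-calibrated test would reject about 5% of the time; the naive fit rejects far more often because the group indicator picks up the effect of Z.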
The functions mouthwash() and backwash() can adjust for hidden
confounding when one does not have access to control genes. They do so
via non-parametric empirical Bayes methods that use the powerful
methodology of Adaptive Shrinkage (Stephens 2016) within the
factor-augmented regression framework described in Wang et al. (2017).
backwash() is a slightly more Bayesian version of mouthwash(). These
methods are described in Gerard and Stephens (2020).
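A call to mouthwash() might look like the sketch below, which runs on a small simulated data set. The argument names (k, cov_of_interest, include_intercept) and the pi0 and result components of the output are assumptions based on the package documentation, so check ?mouthwash before relying on them. The simulated data and its dimensions are purely illustrative, and the call is skipped if vicar is not installed.

```r
set.seed(2)
n <- 20   # samples (assumed)
p <- 100  # genes (assumed)
k <- 2    # hidden confounders (assumed)

X <- cbind(1, rep(c(0, 1), each = n / 2))  # intercept + group indicator
Z <- matrix(rnorm(n * k), nrow = n)        # hidden confounders
A <- matrix(rnorm(k * p), nrow = k)
B <- matrix(0, nrow = 2, ncol = p)
B[2, 1:10] <- 2                            # first 10 genes are non-null
Y <- X %*% B + Z %*% A + matrix(rnorm(n * p), nrow = n)

if (requireNamespace("vicar", quietly = TRUE)) {
  # X already contains an intercept column, so none is added here.
  # Column 2 of X is the covariate being tested.
  mout <- vicar::mouthwash(Y = Y, X = X, k = k,
                           cov_of_interest = 2,
                           include_intercept = FALSE)
  print(mout$pi0)           # assumed: estimated proportion of null genes
  print(head(mout$result))  # assumed: per-gene shrinkage results
}
```

If the interface matches, backwash() should accept the same arguments, which makes it easy to compare the two fits.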
When one has control genes, there are many approaches to take. Such methods include RUV2 (J. A. Gagnon-Bartsch and Speed 2012), RUV4 (J. Gagnon-Bartsch, Jacob, and Speed 2013), and CATE (Wang et al. 2017). This package adds to the field of confounder adjustment with control genes by providing the functions vruv4(), ruv3(), ruvimpute(), and ruvb(). These additions are described in detail in Gerard and Stephens (2021).
See also the related R packages cate (Wang and Zhao 2015) and ruv
(J. Gagnon-Bartsch 2015).
Check out NEWS.md to see what’s new with each version.
If you use any of the control-gene based methods, please cite:
Gerard, D., & Stephens, M. 2021. “Unifying and Generalizing Methods for Removing Unwanted Variation Based on Negative Controls.” Statistica Sinica, 31(3), 1145-1166 <doi:10.5705/ss.202018.0345>.
Or, using BibTeX:
@article{gerard2021unifying,
title={Unifying and Generalizing Methods for Removing Unwanted Variation Based on Negative Controls},
author={Gerard, David and Stephens, Matthew},
journal={Statistica Sinica},
doi={10.5705/ss.202018.0345},
volume={31},
number={3},
pages={1145--1166},
year={2021}
}
If you use either MOUTHWASH or BACKWASH, please cite:
Gerard, D., & Stephens, M. 2020. “Empirical Bayes shrinkage and false discovery rate estimation, allowing for unwanted variation.” Biostatistics, 21(1), 15-32 <doi:10.1093/biostatistics/kxy029>.
Or, using BibTeX:
@article{gerard2020empirical,
author = {Gerard, David and Stephens, Matthew},
title = {Empirical {B}ayes shrinkage and false discovery rate estimation, allowing for unwanted variation},
journal = {Biostatistics},
volume = {21},
number = {1},
pages = {15--32},
year = {2020},
issn = {1465-4644},
doi = {10.1093/biostatistics/kxy029},
}
To install, first install sva and limma from Bioconductor in R:
install.packages("BiocManager")
BiocManager::install(c("limma", "sva"))
Then run in R:
install.packages("devtools")
devtools::install_github("dcgerard/vicar")
If you want some of the tools in vicar to be exactly equivalent to
those in ruv, you’ll need to install an older version of ruv (ruv has
since been updated, so those equivalencies no longer hold exactly):
devtools::install_version("ruv", version = "0.9.6", repos = "http://cran.us.r-project.org")
A note about matrix computations in vicar: Some of the methods in the vicar package such as mouthwash and backwash rely heavily on matrix-vector operations. The speed of these operations can have a big impact on vicar’s performance, especially in large-scale data sets. If you are applying vicar to large data sets, I recommend that you set up R with optimized BLAS (optionally, LAPACK) libraries, especially if you have a multicore computer (most modern laptops and desktops are multicore). See here and here for advice and technical details on this. For example, in our experiments on a high-performance compute cluster we set up R with multithreaded OpenBLAS.
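To see which BLAS and LAPACK libraries your R session is currently linked against, and to get a rough feel for matrix speed, something like the following base-R sketch can help. The matrix size is an arbitrary assumption, and a single timing is only a crude indicator rather than a benchmark.

```r
# Report the BLAS and LAPACK libraries in use (available in R >= 3.4.0).
blas_info <- extSoftVersion()["BLAS"]
lapack_info <- La_library()
print(blas_info)
print(lapack_info)

# Rough timing of a matrix cross-product; an optimized, multithreaded BLAS
# typically makes this noticeably faster than the reference BLAS.
set.seed(1)
m <- 500  # matrix dimension (assumed; increase for a more telling comparison)
M <- matrix(rnorm(m * m), nrow = m)
elapsed <- system.time(crossprod(M))["elapsed"]
print(elapsed)
```

Running the same snippet before and after switching BLAS libraries gives a quick check that the new library is actually being used.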
I’ve provided three vignettes to help you get started with vicar. By
default, the vignettes are not built when you use install_github. To
build the vignettes during installation, run
install.packages("devtools")
devtools::install_github("dcgerard/vicar", build_vignettes = TRUE)
Note that this will result in a somewhat slower install. The first
vignette, sample_analysis, gives a sample analysis using vicar to
account for hidden confounding. The second vignette, customFA, gives a
few instructions on how to incorporate user-defined factor analyses with
the confounder adjustment procedures implemented in vicar. The third
vignette, custom_prior, gives instructions and examples on
incorporating a user-specified prior into ruvb. To see these vignettes
after install, run
utils::vignette("sample_analysis", package = "vicar")
utils::vignette("customFA", package = "vicar")
utils::vignette("custom_prior", package = "vicar")