This function performs a variant of Removing Unwanted Variation 4-step (RUV4) (Gagnon-Bartsch et al., 2013) in which the control genes are used not only to estimate the hidden confounders, but also to estimate a variance inflation parameter. This variance inflation step is akin to the "empirical null" approach of Efron (2004). After this procedure, Adaptive SHrinkage (ASH) (Stephens, 2016) is performed on the coefficient estimates and the inflated standard errors.
Y |
A matrix of numerics. These are the response variables where each column has its own variance. In a gene expression study, the rows are the individuals and the columns are the genes. |
X |
A matrix of numerics. The covariates of interest. |
ctl |
A vector of logicals of length |
k |
A non-negative integer. The number of unobserved confounders. If not specified and the R package sva is installed, then this function will estimate the number of hidden confounders using the methods of Buja and Eyuboglu (1992). |
cov_of_interest |
A positive integer. The column number of the
covariate in X whose coefficients you are interested in. The
rest are considered nuisance parameters and are regressed out
by OLS. |
likelihood |
Either |
ash_args |
A list of arguments to pass to ash. See
|
limmashrink |
A logical. Should we apply hierarchical
shrinkage to the variances ( |
degrees_freedom |
if |
include_intercept |
A logical. If |
gls |
A logical. Should we use generalized least squares
( |
fa_func |
A factor analysis function. The function must have
as inputs a numeric matrix |
fa_args |
A list. Additional arguments you want to pass to fa_func. |
scale_var |
A logical. Should we use the variance inflation
parameter in the estimated standard errors when inserting into
|
The model is
Y = XB + ZA + E,
where Y is a matrix of responses (e.g. log-transformed gene expression levels), X is a matrix of covariates, B is a matrix of coefficients, Z is a matrix of unobserved confounders, A is a matrix of unobserved coefficients of the unobserved confounders, and E is the noise matrix, whose elements are independent Gaussian with a common variance within each column. The rows of Y are the observations (e.g. individuals) and the columns of Y are the response variables (e.g. genes).
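To make the dimensions in this model concrete, the following base R sketch simulates data from Y = XB + ZA + E. The sizes (n, p, q, k), the effect sizes, and the choice of which genes have null effects are all assumptions made for illustration; they do not come from the package.

```r
# A minimal simulation from Y = XB + ZA + E (illustrative values only).
set.seed(1)
n <- 20   # observations (e.g. individuals), the rows of Y
p <- 100  # response variables (e.g. genes), the columns of Y
q <- 2    # observed covariates: an intercept and one covariate of interest
k <- 3    # unobserved confounders

X <- cbind(1, rep(c(0, 1), each = n / 2))  # n x q design matrix
B <- matrix(0, nrow = q, ncol = p)         # q x p coefficient matrix
B[2, 1:10] <- rnorm(10, sd = 2)            # assume only the first 10 genes have effects
Z <- matrix(rnorm(n * k), nrow = n)        # n x k unobserved confounders
A <- matrix(rnorm(k * p), nrow = k)        # k x p confounder coefficients

# Noise: independent Gaussian, one standard deviation per column (gene).
sigma <- sqrt(rchisq(p, df = 5) / 5)
E <- sweep(matrix(rnorm(n * p), nrow = n), 2, sigma, `*`)

Y <- X %*% B + Z %*% A + E                 # n x p response matrix

# Genes with no effect of the covariate of interest could serve as controls.
ctl <- B[2, ] == 0
dim(Y)
```

Here `ctl` plays the role of the logical control-gene vector described in the arguments: it marks the columns of Y whose coefficient for the covariate of interest is zero.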
This model is fit using a two-step approach proposed in Gagnon-Bartsch et al. (2013) and described in Wang et al. (2017), modified to include estimating a variance inflation parameter. Rather than use OLS in the second step of this two-step procedure, we estimate the coefficients using Adaptive SHrinkage (ASH) (Stephens, 2016). In the current implementation, only the coefficients of one covariate can be estimated using ASH. The rest are regressed out using OLS.
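The two-step idea can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's implementation: it uses a truncated SVD as a stand-in for the factor analysis, ignores the uncertainty in the estimated confounders, and estimates the variance inflation parameter as a scaled sum of squared standardized residuals over the control genes. The simulated data and all variable names are hypothetical.

```r
# Simplified two-step fit with control genes (illustrative sketch only).
set.seed(2)
n <- 40; p <- 200; k <- 2
X <- cbind(1, rep(c(0, 1), each = n / 2))   # intercept + covariate of interest
q <- ncol(X)
beta <- c(rnorm(20, sd = 2), rep(0, p - 20))
ctl <- c(rep(FALSE, 100), rep(TRUE, 100))   # assumed known controls (beta is 0 there)
Z <- matrix(rnorm(n * k), n)
Z[, 1] <- Z[, 1] + X[, 2]                   # confounder correlated with the covariate
A <- matrix(rnorm(k * p), k)
Y <- outer(X[, 2], beta) + Z %*% A + matrix(rnorm(n * p), n)

# Rotate by the QR decomposition of X so the covariate of interest is isolated.
qrX <- qr(X)
Q <- qr.Q(qrX, complete = TRUE)
R <- qr.R(qrX)
Yrot <- crossprod(Q, Y)
Y1 <- Yrot[q, ]                             # carries beta plus confounding
Y2 <- Yrot[(q + 1):n, , drop = FALSE]       # free of X, carries confounding only

# Step 1: factor analysis on Y2 (truncated SVD used as a stand-in here).
sv <- svd(Y2, nu = k, nv = k)
alpha <- sv$d[1:k] * t(sv$v)                # k x p estimated confounder coefficients
sig2 <- colSums((Y2 - sv$u %*% alpha)^2) / (n - q - k)  # per-gene variances

# Step 2: on control genes beta = 0, so regress Y1 on alpha to estimate the
# confounders in the rotated row, weighting by the inverse variances.
fit <- lm.wfit(x = t(alpha[, ctl, drop = FALSE]), y = Y1[ctl], w = 1 / sig2[ctl])
z1 <- fit$coefficients

# Variance inflation: the control-gene residuals should have variance sig2
# if the model is right; lambda measures how far off that is.
r_ctl <- Y1[ctl] - drop(crossprod(alpha[, ctl, drop = FALSE], z1))
lambda <- sum(r_ctl^2 / sig2[ctl]) / (sum(ctl) - k)

# Coefficient estimates and inflated standard errors for every gene.
betahat <- (Y1 - drop(crossprod(alpha, z1))) / R[q, q]
sebetahat <- sqrt(lambda * sig2) / abs(R[q, q])

# The shrinkage step would then take these as inputs, e.g.
# ashr::ash(betahat, sebetahat)   # not run; requires the ashr package
```

The last comment marks where ASH would replace the plain OLS estimates: it takes `betahat` and the inflated `sebetahat` for the single covariate of interest, consistent with the restriction that only one covariate's coefficients are shrunk.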
Except for the list ruv4, the values returned are exactly the same as in ash.workhorse. See that function for more details. The elements of ruv4 are exactly the same as those returned by vruv4.
David Gerard
Buja, A. and Eyuboglu, N., 1992. "Remarks on parallel analysis." Multivariate behavioral research, 27(4), pp.509-540. doi: 10.1207/s15327906mbr2704_2
Efron, B., 2004. "Large-scale simultaneous hypothesis testing: the choice of a null hypothesis." Journal of the American Statistical Association, 99(465), pp.96-104. doi: 10.1198/016214504000000089
Gagnon-Bartsch, J., Jacob, L. and Speed, T.P., 2013. "Removing unwanted variation from high dimensional data with negative controls." Berkeley: Department of Statistics, University of California. https://statistics.berkeley.edu/tech-reports/820
Stephens, M., 2016. "False discovery rates: a new deal." Biostatistics, 18(2), pp.275-294. doi: 10.1093/biostatistics/kxw041
Wang, J., Zhao, Q., Hastie, T. and Owen, A.B., 2017. "Confounder adjustment in multiple hypothesis testing." The Annals of Statistics, 45(5), pp.1863-1894. doi: 10.1214/16-AOS1511