vruv2_old: Old version of Calibrated RUV2 that doesn't really work.


View source: R/ruv2.R

Description

This function performs a variant of Removing Unwanted Variation 2-step (RUV2) (Gagnon-Bartsch et al., 2013) that includes a variance inflation parameter in the second step and estimates it by maximum likelihood.

Usage

vruv2_old(
  Y,
  X,
  ctl,
  k = NULL,
  cov_of_interest = ncol(X),
  likelihood = c("t", "normal"),
  limmashrink = TRUE,
  degrees_freedom = NULL,
  include_intercept = TRUE,
  gls = TRUE,
  fa_func = pca_naive,
  fa_args = list()
)

Arguments

Y

A matrix of numerics. These are the response variables where each column has its own variance. In a gene expression study, the rows are the individuals and the columns are the genes.

X

A matrix of numerics. The covariates of interest.

ctl

A vector of logicals of length ncol(Y). If position i is TRUE then position i is considered a negative control.

k

A non-negative integer. The number of unobserved confounders. If not specified and the R package sva is installed, then this function will estimate the number of hidden confounders using the methods of Buja and Eyuboglu (1992).

cov_of_interest

A vector of positive integers. The column numbers of the covariates in X whose coefficients you are interested in. The rest are considered nuisance parameters and are regressed out by OLS.
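As a rough illustration of what "regressed out by OLS" means, the sketch below removes the nuisance columns from both Y and the covariate of interest by taking least-squares residuals. The simulated data and all variable names are assumptions made for the example; this is not the package's internal code.

```r
# Illustration only: project out nuisance covariates by OLS.
set.seed(1)
n <- 20
p <- 50
X <- cbind(intercept = 1, batch = rnorm(n), treatment = rep(c(0, 1), n / 2))
Y <- matrix(rnorm(n * p), nrow = n, ncol = p)

cov_of_interest <- 3                               # column of X we care about
X_nuis <- X[, -cov_of_interest, drop = FALSE]      # nuisance covariates
X_int  <- X[, cov_of_interest, drop = FALSE]       # covariate of interest

# Residuals after regressing each column on the nuisance covariates.
qr_nuis <- qr(X_nuis)
Y_resid <- qr.resid(qr_nuis, Y)
X_resid <- qr.resid(qr_nuis, X_int)
```

After this step, Y_resid and X_resid are orthogonal to every nuisance column, so a regression of Y_resid on X_resid gives the same coefficient estimates for the covariate of interest as the full OLS fit.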

likelihood

Either "normal" or "t". If likelihood = "t", then the user may provide the degrees of freedom via degrees_freedom.

limmashrink

A logical. Should we apply hierarchical shrinkage to the variances (TRUE) or not (FALSE)? If degrees_freedom = NULL, limmashrink = TRUE, and likelihood = "t", then the degrees of freedom returned by limma are used as well.

degrees_freedom

If likelihood = "t", then this is the user-defined degrees of freedom for that distribution. If degrees_freedom is NULL, then the degrees of freedom will be the sample size minus the number of covariates minus k.

include_intercept

A logical. If TRUE, then it will check X to see if it has an intercept term. If not, then it will add an intercept term. If FALSE, then X will be unchanged.
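The behaviour described for include_intercept = TRUE could look something like the sketch below. The helper add_intercept_if_missing is hypothetical, and the test used here (a non-zero constant column) is an assumption; the check the package actually performs may differ.

```r
# Hypothetical helper: add an intercept column only if X has no
# non-zero constant column already.
add_intercept_if_missing <- function(X) {
  is_constant <- apply(X, 2, function(col) all(col == col[1]) && col[1] != 0)
  if (!any(is_constant)) {
    X <- cbind(1, X)   # prepend, so the last column of X stays the last column
  }
  X
}

X_no_int   <- matrix(c(0.5, -1.2, 0.3, 2.0), ncol = 1)
X_with_int <- cbind(1, X_no_int)

ncol(add_intercept_if_missing(X_no_int))    # intercept added
ncol(add_intercept_if_missing(X_with_int))  # left unchanged
```

The sketch prepends the intercept so that the default cov_of_interest = ncol(X) would still point at the same covariate; whether the package places the new column first or last is not stated here.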

gls

A logical. Should we estimate the part of the confounders associated with the nuisance parameters with GLS (TRUE) or with OLS (FALSE)?

fa_func

A factor analysis function. The function must have as inputs a numeric matrix Y and a rank (numeric scalar) r. It must output numeric matrices alpha and Z and a numeric vector sig_diag. alpha is the estimate of the coefficients of the unobserved confounders, so it must be an r by ncol(Y) matrix. Z must be an r by nrow(Y) matrix. sig_diag is the estimate of the column-wise variances so it must be of length ncol(Y). The default is the function pca_naive that just uses the first r singular vectors as the estimate of alpha. The estimated variances are just the column-wise mean square.
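A custom factor analysis function following the interface described above might be sketched as below. The name fa_svd_sketch is hypothetical, and the output shapes follow the wording of this page (alpha is r by ncol(Y), Z is r by nrow(Y)). Before passing such a function as fa_func, it would be prudent to compare the dimensions of its outputs with those returned by pca_naive on the same input, since the orientation matters.

```r
# Hypothetical truncated-SVD factor analysis with the documented interface:
# inputs Y (numeric matrix) and r (rank); outputs alpha, Z, and sig_diag.
fa_svd_sketch <- function(Y, r) {
  n <- nrow(Y)
  p <- ncol(Y)
  if (r == 0) {
    # No factors: variances are the column-wise mean squares of Y.
    return(list(alpha = matrix(0, 0, p), Z = matrix(0, 0, n),
                sig_diag = colMeans(Y^2)))
  }
  sv <- svd(Y, nu = r, nv = r)
  Z <- t(sv$u) * sqrt(n)                            # r x nrow(Y)
  alpha <- (sv$d[seq_len(r)] * t(sv$v)) / sqrt(n)   # r x ncol(Y)
  resid <- Y - t(Z) %*% alpha                       # rank-r approximation error
  list(alpha = alpha, Z = Z, sig_diag = colMeans(resid^2))
}

set.seed(3)
Y_demo <- matrix(rnorm(10 * 6), nrow = 10, ncol = 6)
fa_out <- fa_svd_sketch(Y_demo, r = 2)
sapply(fa_out, function(x) paste(dim(as.matrix(x)), collapse = " x "))
```

Under these assumptions it would be supplied as fa_func = fa_svd_sketch, with any extra arguments passed through fa_args.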

fa_args

A list. Additional arguments you want to pass to fa_func.

Details

See vruv4 for a description of the model.

Under the RUV2 adjustment, the variances of the non-control genes are unchanged; only the variances of the control genes are inflated. Hence, for this method to adjust the variances of the genes of interest (the non-control genes), you must use hierarchical shrinkage of the variances. That is, you must set limmashrink = TRUE.
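A minimal call on simulated data might look like the sketch below. It assumes the vicar package (from dcgerard/vicar) and limma are installed and that vruv2_old is exported; the simulated design, the choice of 50 control genes, and k = 1 are all assumptions made for the example. The structure of the returned object is not described on this page, so the sketch only inspects it with str().

```r
# Simulate a small expression data set with one hidden confounder.
set.seed(2)
n <- 30                                   # individuals (rows of Y)
p <- 200                                  # genes (columns of Y)
X <- cbind(1, rep(c(0, 1), n / 2))        # intercept + covariate of interest
beta <- c(rep(0, 100), rnorm(100))        # first 100 genes have no effect
Z_hidden <- rnorm(n)                      # unobserved confounder
Y <- outer(X[, 2], beta) + outer(Z_hidden, rnorm(p)) +
  matrix(rnorm(n * p), nrow = n, ncol = p)

# Negative controls: a logical vector of length ncol(Y).
ctl <- rep(FALSE, p)
ctl[1:50] <- TRUE

# Only run the fit if the assumed packages are available.
if (requireNamespace("vicar", quietly = TRUE) &&
    requireNamespace("limma", quietly = TRUE)) {
  fit <- vicar::vruv2_old(Y = Y, X = X, ctl = ctl, k = 1,
                          likelihood = "t", limmashrink = TRUE)
  str(fit, max.level = 1)
}
```

Here limmashrink = TRUE is set explicitly because, as noted above, it is the only route by which the variances of the non-control genes are adjusted.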

Author(s)

David Gerard

References


dcgerard/vicar documentation built on July 7, 2021, 1:08 p.m.