caw: Iterative procedure for confounder correction with a...


View source: R/caw.R

Description

Quoth the Raven "Caw, caw!"

Usage

caw(
  Y,
  X,
  k = NULL,
  cov_of_interest = ncol(X),
  limmashrink = TRUE,
  weight_func = ash_wrap,
  weight_args = list(),
  fa_func = pca_naive,
  fa_args = list(),
  scale_var = TRUE,
  include_intercept = TRUE,
  weight_init = c("all_null", "random", "limma"),
  weight_func_input = c("summary2"),
  degrees_freedom = NULL,
  min_scale = 0.8
)
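
A minimal sketch of a call on simulated data, using only the arguments listed in the signature above. The data are made up for illustration, k = 2 is an arbitrary choice, and the structure of the returned object is not documented on this page, so it is only inspected with str(). The call is guarded because it assumes the vicar package is installed, and (as an assumption about the default weight_func = ash_wrap) the ashr package as well.

```r
set.seed(1)
n <- 20   # individuals (rows of Y)
p <- 100  # genes (columns of Y)

# Simulated response matrix: rows are individuals, columns are genes.
Y <- matrix(rnorm(n * p), nrow = n, ncol = p)

# Covariate matrix: an intercept column plus one binary covariate of interest.
X <- cbind(1, rbinom(n, size = 1, prob = 0.5))

# Only run the fit when the required packages are available.
if (requireNamespace("vicar", quietly = TRUE) &&
    requireNamespace("ashr", quietly = TRUE)) {
  # k is given explicitly so that the sva-based estimate is not needed.
  fit <- vicar::caw(Y = Y, X = X, k = 2, cov_of_interest = 2)
  str(fit, max.level = 1)
}
```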

Arguments

Y

A matrix of numerics. These are the response variables where each column has its own variance. In a gene expression study, the rows are the individuals and the columns are the genes.

X

A matrix of numerics. The covariates of interest.

k

A non-negative integer. The number of unobserved confounders. If not specified and the R package sva is installed, then this function will estimate the number of hidden confounders using the methods of Buja and Eyuboglu (1992).

cov_of_interest

A vector of positive integers. The column numbers of the covariates in X whose coefficients you are interested in. The rest are considered nuisance parameters and are regressed out by OLS.
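
One way to picture "regressed out by OLS" is to replace Y and the covariates of interest by their residuals after an ordinary least squares fit on the nuisance columns. The following is a sketch in base R under that assumption; caw may do this differently internally (for example through a rotation), and the simulated data are purely illustrative.

```r
set.seed(1)
n <- 20
p <- 5
X <- cbind(1, rnorm(n), rbinom(n, size = 1, prob = 0.5))
Y <- matrix(rnorm(n * p), nrow = n, ncol = p)

cov_of_interest <- 3  # column of X whose coefficient we care about

# Nuisance covariates: every column of X not listed in cov_of_interest.
X_nuis <- X[, -cov_of_interest, drop = FALSE]

# OLS residuals of Y and of the covariate of interest on the nuisance columns.
Y_resid <- stats::lm.fit(X_nuis, Y)$residuals
x_resid <- stats::lm.fit(X_nuis, X[, cov_of_interest, drop = FALSE])$residuals

# After this step the residuals are orthogonal to the nuisance columns.
max(abs(crossprod(X_nuis, Y_resid)))
```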

limmashrink

A logical. Should we apply hierarchical shrinkage to the variances (TRUE) or not (FALSE)? If degrees_freedom = NULL and limmashrink = TRUE and likelihood = "t", then we'll also use the limma returned degrees of freedom.

weight_func

The function that returns the weights (or lfdr's). Many forms of input are allowed. See weight_func_input for details.

weight_args

Additional arguments to pass to weight_func.

fa_func

A factor analysis function. The function must have as inputs a numeric matrix Y and a rank (numeric scalar) r. It must output numeric matrices alpha and Z and a numeric vector sig_diag. alpha is the estimate of the coefficients of the unobserved confounders, so it must be an r by ncol(Y) matrix. Z must be an r by nrow(Y) matrix. sig_diag is the estimate of the column-wise variances so it must be of length ncol(Y). The default is the function pca_naive that just uses the first r singular vectors as the estimate of alpha. The estimated variances are just the column-wise mean square.
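
As an illustration of the interface described above, here is a sketch of a custom factor analysis function built on a truncated SVD. The name fa_svd_sketch is hypothetical, and the matrix orientations follow the shapes as stated in this description; it may be worth comparing them against the output of pca_naive before passing such a function as fa_func.

```r
# Hypothetical factor analysis function: takes a numeric matrix Y and a rank r,
# returns alpha (r by ncol(Y)), Z (r by nrow(Y)), and sig_diag (length ncol(Y)).
fa_svd_sketch <- function(Y, r) {
  stopifnot(is.matrix(Y), length(r) == 1, r >= 1, r < min(dim(Y)))
  n <- nrow(Y)
  idx <- seq_len(r)
  sv <- svd(Y)

  # Confounder scores, scaled so that each row has mean square 1.
  Z <- t(sv$u[, idx, drop = FALSE]) * sqrt(n)

  # Coefficients of the confounders from the first r right singular vectors.
  alpha <- diag(sv$d[idx], r, r) %*% t(sv$v[, idx, drop = FALSE]) / sqrt(n)

  # Column-wise mean square of what the rank-r fit leaves unexplained.
  resid <- Y - t(Z) %*% alpha
  sig_diag <- colMeans(resid^2)

  list(alpha = alpha, Z = Z, sig_diag = sig_diag)
}

set.seed(1)
Y <- matrix(rnorm(10 * 5), nrow = 10, ncol = 5)
out <- fa_svd_sketch(Y, r = 2)
dim(out$alpha)
dim(out$Z)
length(out$sig_diag)
```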

fa_args

A list. Additional arguments you want to pass to fa_func.

scale_var

A logical. Should we scale the variance (TRUE) or not (FALSE)?

include_intercept

A logical. If TRUE, then it will check X to see if it has an intercept term. If not, then it will add an intercept term. If FALSE, then X will be unchanged.
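
The check described above can be sketched as follows. How caw actually detects an intercept is not stated here; this sketch assumes an intercept is a column consisting entirely of ones, and the helper name add_intercept_sketch is hypothetical.

```r
# Hypothetical helper: add a column of ones to X unless one is already present.
add_intercept_sketch <- function(X) {
  stopifnot(is.matrix(X))
  has_intercept <- any(apply(X, 2, function(col) all(col == 1)))
  if (!has_intercept) {
    X <- cbind(1, X)
  }
  X
}

X_no_int <- matrix(c(0.5, 1.5, 2.5), ncol = 1)
X_with_int <- cbind(1, X_no_int)

ncol(add_intercept_sketch(X_no_int))    # intercept column is added
ncol(add_intercept_sketch(X_with_int))  # X is left unchanged
```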

weight_init

A character. How should we initialize the weights? The options are to initialize in the all-null setting ("all_null"), draw the weights randomly from iid uniforms ("random"), or run an iteration of weight_func prior to the first round of estimating the confounders ("limma"). This last option uses limma-ebayes.

weight_func_input

The form of input for weight_func. Right now only "summary2" is supported; the other forms listed here are intended for future support. If weight_func_input = "summary1", then the function takes only p-values as input (called pvalues). If weight_func_input = "summary2", then the function takes a vector of effect estimates betahat, a vector of standard errors sebetahat, and a vector of degrees of freedom degrees_freedom. If weight_func_input = "summary3", then the input is a matrix of effects betamat, an array of covariances cov_array where each cov_array[,,i] is the covariance of the elements of betamat[i, ], and a vector of degrees of freedom dfvec. If weight_func_input = "full", then the input is just a response matrix Y and a covariate matrix X.
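
To make the "summary2" inputs concrete, here is a sketch of a function with that argument list. The name pvalue_weights_sketch is hypothetical, and the use of a two-sided t-test p-value as a weight is only a crude stand-in, not an lfdr. The return format that caw expects from weight_func is not documented on this page, so returning a plain numeric vector is an assumption; the output of ash_wrap would be the thing to mirror in practice.

```r
# Hypothetical weight function taking the "summary2" inputs:
# effect estimates, their standard errors, and degrees of freedom.
pvalue_weights_sketch <- function(betahat, sebetahat, degrees_freedom) {
  stopifnot(length(betahat) == length(sebetahat), all(sebetahat > 0))
  tstat <- betahat / sebetahat
  # Two-sided p-value: close to 1 for null-looking effects, close to 0 otherwise.
  2 * stats::pt(-abs(tstat), df = degrees_freedom)
}

w <- pvalue_weights_sketch(
  betahat = c(0, 3),
  sebetahat = c(1, 1),
  degrees_freedom = 10
)
w
```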

degrees_freedom

If likelihood = "t", then this is the user-defined degrees of freedom for that distribution. If degrees_freedom is NULL then the degrees of freedom will be the sample size minus the number of covariates minus k.
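
As a worked instance of the default above, with illustrative numbers: the sketch assumes that "the number of covariates" means ncol(X), counting an intercept column if one is present.

```r
n <- 20      # sample size, nrow(Y)
n_cov <- 3   # number of covariates, assumed to be ncol(X)
k <- 2       # number of unobserved confounders

# Default degrees of freedom when degrees_freedom = NULL.
default_df <- n - n_cov - k
default_df   # 15
```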

min_scale

The minimum estimate for the variance inflation term.

Author(s)

David Gerard


dcgerard/vicar documentation built on July 7, 2021, 1:08 p.m.