succotash: Surrogate and Confounder Correction Occuring Together with...
In dcgerard/succotashr: Surrogate and Confounder Correction Occuring Together with Adaptive Shrinkage

Description Usage Arguments Details Value See Also

This function implements the full SUCCOTASH method. First, it rotates the response and explanatory variables into a part that we use to estimate the confounding variables and the variances, and a part that we use to estimate the coefficients of the observed covariates. This function will implement a factor analysis for the first part then run succotash_given_alpha for the second part.

succotash(Y, X, k = NULL, sig_reg = 0.01, num_em_runs = 2,
  z_start_sd = 1, two_step = TRUE, fa_method = c("pca", "reg_mle",
  "quasi_mle", "homoPCA", "pca_shrinkvar", "mod_fa", "flash_hetero", "non_homo",
  "non_hetero", "non_shrinkvar"), lambda_type = c("zero_conc", "ones"),
  mix_type = c("normal", "uniform"), likelihood = c("normal", "t"),
  lambda0 = 10, tau_seq = NULL, em_pi_init = NULL,
  plot_new_ests = FALSE, em_itermax = 200, var_scale = TRUE,
  inflate_var = 1, optmethod = c("coord", "em"), use_ols_se = FALSE,
  z_init_type = c("null_mle", "random"), var_scale_init_type = c("null_mle",
  "one", "random"))

`Y`	An `n` by `p` matrix of response variables.
`X`	An `n` by `q` matrix of covariates. Only the variable in the last column is of interest.
`k`	An integer. The number of hidden confounders. If `NULL` and `sva` is installed, this will be estimated by the `num.sv` function in the `sva` package, available on Bioconductor.
`sig_reg`	A numeric. If `fa_method` is `"reg_mle"`, then this is the value of the regularization parameter.
`num_em_runs`	An integer. The number of times we should run the EM algorithm.
`z_start_sd`	A positive numeric. At the beginning of each EM algorithm, `Z` is initialized with independent mean-zero normal draws with standard deviation `z_start_sd`.
`two_step`	A logical. Should we run the two-step SUCCOTASH procedure of inflating the variance (`TRUE`) or not (`FALSE`)? Defaults to `TRUE`.
`fa_method`	Which factor analysis method should we use? The options are the regularized MLE implemented in `factor_mle` (`"reg_mle"`); two methods from the package `cate`, namely the quasi-MLE (`"quasi_mle"`) from Bai and Li (2012) and naive PCA (`"pca"`); FLASH (`"flash_hetero"`); homoscedastic PCA (`"homoPCA"`); PCA followed by shrinking the variances using limma (`"pca_shrinkvar"`); and moderated factor analysis (`"mod_fa"`). Three methods for no confounder adjustment are also available: `"non_homo"`, `"non_shrinkvar"`, and `"non_hetero"`.
`lambda_type`	See `succotash_given_alpha` for options on the regularization parameter of the mixing proportions.
`mix_type`	Should the prior be a mixture of normals `mix_type = 'normal'` or a mixture of uniforms `mix_type = 'uniform'`?
`likelihood`	Which likelihood should we use? Normal (`"normal"`) or t (`"t"`)?
`lambda0`	If `lambda_type = "zero_conc"`, then `lambda0` is the amount to penalize `pi0`.
`tau_seq`	A vector of length `M` containing the standard deviations (not variances) of the mixing distributions.
`em_pi_init`	A vector of length `M` containing the starting values of π. If `NULL`, then one of three options is used to calculate `pi_init`, based on the value of `pi_init_type`. Only available in normal mixtures for now.
`plot_new_ests`	A logical. Should we plot the mixing proportions at each iteration of the EM algorithm?
`em_itermax`	A positive numeric. The maximum number of iterations to run during the EM algorithm.
`var_scale`	A logical. Should we update the scaling on the variances (`TRUE`) or not (`FALSE`)? Only works for the normal mixtures case right now. Defaults to `TRUE`.
`inflate_var`	A positive numeric. The multiplicative amount to inflate the variance estimates by. There is no theoretical justification for it to be anything but 1, but I have it in here to play around with it.
`optmethod`	Either coordinate ascent (`"coord"`) or an EM algorithm (`"em"`). Coordinate ascent is currently only implemented in the uniform mixtures case, for which it is the default.
`use_ols_se`	A logical. Should we use the standard formulas for the OLS regression of Y on X to get the estimates of the variances (`TRUE`) or not (`FALSE`)?
`z_init_type`	How should we initialize the confounders? At the all-null MLE (`"null_mle"`) or from iid standard normals (`"random"`)?
`var_scale_init_type`	If `var_scale = TRUE`, how should we initialize the variance inflation parameter? From the all-null MLE (`"null_mle"`), at no inflation (`"one"`), or from a chi-squared distribution with one degree of freedom (`"random"`)?

The assumed model is

Y = Xβ + Zα + E.

Y is an n by p matrix of response variables. For example, each row might be an array of log-transformed and quantile-normalized gene-expression data. X is an n by q matrix of observed covariates. All but the last column of X are assumed to correspond to nuisance parameters. For example, the first column might be a vector of ones to include an intercept. β is a q by p matrix of corresponding coefficients. Z is an n by k matrix of confounder variables. α is the corresponding k by p matrix of coefficients for the unobserved confounders. E is an n by p matrix of error terms. E is assumed to be matrix normal with identity row covariance and diagonal column covariance Σ. That is, the columns are heteroscedastic while the rows are homoscedastic and independent.
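To make the dimensions concrete, here is a minimal base R sketch that simulates data from the model above. The sizes, the number of non-null effects, and the distribution used for the diagonal of Σ are arbitrary choices made for illustration; they are not taken from the package.

```r
set.seed(1)
n <- 20; p <- 100; q <- 2; k <- 3          # illustrative sizes only

# X: intercept column (nuisance) plus the covariate of interest in the last column
X <- cbind(1, rnorm(n))

# beta: q by p; here only the first 10 responses have a non-zero effect (assumption)
beta <- matrix(0, nrow = q, ncol = p)
beta[1, ] <- rnorm(p)
beta[2, 1:10] <- rnorm(10, sd = 2)

# Z and alpha: unobserved confounders and their coefficients
Z     <- matrix(rnorm(n * k), nrow = n, ncol = k)
alpha <- matrix(rnorm(k * p), nrow = k, ncol = p)

# E: independent rows, column-specific variances (the diagonal of Sigma)
sigma2 <- rchisq(p, df = 5) / 5
E <- matrix(rnorm(n * p), nrow = n, ncol = p) %*% diag(sqrt(sigma2))

Y <- X %*% beta + Z %*% alpha + E
dim(Y)
```

In a real analysis only Y and X are observed; Z, α, β, and Σ are what the method has to estimate.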

This function will first rotate Y and X using the QR decomposition. This separates the model into three parts. The first part only contains nuisance parameters, the second part contains the coefficients of interest, and the third part contains the confounders. succotash applies a factor analysis to the third part to estimate the confounding factors, then runs an EM algorithm on the second part to estimate the coefficients of interest.
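The rotation can be sketched in base R as follows. This is an illustration of the idea under the stated model, not the package's internal code; the names `Y1`, `Y2`, and `Y3` are chosen here for the three parts and are hypothetical.

```r
set.seed(2)
n <- 15; p <- 8; q <- 2
X <- cbind(1, rnorm(n))                     # last column is the covariate of interest
Y <- matrix(rnorm(n * p), nrow = n, ncol = p)

qr_x <- qr(X)
QtY  <- qr.qty(qr_x, Y)                     # Q'Y, where X = QR and Q is n by n
R    <- qr.R(qr_x)                          # q by q upper-triangular factor

Y1 <- QtY[seq_len(q - 1), , drop = FALSE]   # rows tied to the nuisance coefficients
Y2 <- QtY[q, , drop = FALSE]                # row carrying the coefficient of interest
Y3 <- QtY[(q + 1):n, , drop = FALSE]        # rows free of X: confounders plus noise

# Dividing the second part by R[q, q] recovers the OLS estimate
# of the last row of beta.
beta_ols <- as.vector(Y2 / R[q, q])
```

Because the rotated X is zero in the rows of `Y3`, those rows depend only on the confounders and the noise, which is why the factor analysis is applied there.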

Many forms of factor analysis are available. The default is PCA with the column-wise residual mean-squares as the estimates of the column-wise variances.
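A rough sketch of a PCA-style factor analysis on the confounder-only part is given below, using a truncated SVD. The matrix `Y3`, the scaling of the factors, and the degrees-of-freedom divisor `m - k` are assumptions made for this illustration; the package may scale or divide differently.

```r
set.seed(3)
m <- 30; p <- 50; k <- 2                    # m rows left after the rotation (illustrative)
Z3    <- matrix(rnorm(m * k), nrow = m, ncol = k)
alpha <- matrix(rnorm(k * p), nrow = k, ncol = p)
Y3    <- Z3 %*% alpha + matrix(rnorm(m * p), nrow = m, ncol = p)

# Rank-k approximation via the SVD
sv  <- svd(Y3, nu = k, nv = k)
fit <- sv$u %*% diag(sv$d[1:k], k) %*% t(sv$v)

# One common scaling: factors with unit mean-square, loadings absorb the rest
Z_hat     <- sqrt(m) * sv$u
alpha_hat <- diag(sv$d[1:k], k) %*% t(sv$v) / sqrt(m)

# Column-wise residual mean-squares as variance estimates
# (dividing by m - k is an assumption of this sketch)
resid    <- Y3 - fit
sig_diag <- colSums(resid^2) / (m - k)
```

The factors and loadings are identified only up to an invertible k by k transformation, so `alpha_hat` should not be compared entry by entry with the simulated `alpha`.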

See succotash_given_alpha for details of output.

Y1_scaled The OLS estimates.

sig_diag_scaled The estimated standard errors of the estimated effects (calculated from the factor analysis step) times scale_val.

sig_diag The estimates of the gene-wise variances (but not times scale_val).

pi0 A non-negative numeric. The marginal probability of zero.

alpha_scaled The scaled version of the estimated coefficients of the hidden confounders.

Z A vector of numerics. Estimated rotated confounder in second step of succotash.

pi_vals A vector of numerics between 0 and 1. The mixing proportions.

tau_seq A vector of non-negative numerics. The mixing standard deviations (not variances).

lfdr A vector of numerics between 0 and 1. The local false discovery rate. I.e. the posterior probability of a coefficient being zero.

lfsr A vector of numerics between 0 and 1. The local false sign rate. I.e. the posterior probability of making a sign error if one chose the most probable sign.

qvals A vector of numerics between 0 and 1. The q-values. The average error rate if we reject all hypotheses that have smaller q-value.

betahat A vector of numerics. The posterior mean of the coefficients.
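As an illustration of how `qvals` relates to `lfdr`, the sketch below uses a common construction in which the q-value of a hypothesis is the average lfdr over all hypotheses whose lfdr is at least as small. The helper name `qvals_from_lfdr` is hypothetical, and whether the package computes its q-values in exactly this way is an assumption.

```r
# Hypothetical helper: q-value as the running mean of the sorted lfdr values
qvals_from_lfdr <- function(lfdr) {
  ord <- order(lfdr)
  qv  <- numeric(length(lfdr))
  qv[ord] <- cumsum(lfdr[ord]) / seq_along(lfdr)
  qv
}

lfdr <- c(0.5, 0.1, 0.3)
qvals_from_lfdr(lfdr)
# 0.3 0.1 0.2
```

Rejecting every hypothesis with q-value at most 0.2 here rejects the second and third hypotheses, whose average lfdr is (0.1 + 0.3) / 2 = 0.2.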

succotash_given_alpha, factor_mle, succotash_summaries.
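A possible call on simulated data is sketched below. It uses only arguments and output names listed on this page, but it has not been checked against a particular version of the package. The simulated data and the choice `k = 2` are illustrative assumptions, and the call is skipped when `succotashr` is not installed.

```r
# Assumes the package has been installed from GitHub, for example with
# devtools::install_github("dcgerard/succotashr")
set.seed(4)
n <- 20; p <- 100
X <- cbind(1, rep(c(0, 1), each = n / 2))   # intercept plus a two-group indicator
Z <- matrix(rnorm(n * 2), nrow = n, ncol = 2)
Y <- Z %*% matrix(rnorm(2 * p), nrow = 2, ncol = p) +
  matrix(rnorm(n * p), nrow = n, ncol = p)

if (requireNamespace("succotashr", quietly = TRUE)) {
  fit <- succotashr::succotash(Y = Y, X = X, k = 2, fa_method = "pca")
  print(head(fit$betahat))                  # posterior means of the coefficients
  print(fit$pi0)                            # estimated marginal probability of zero
}
```

Supplying `k` directly avoids the dependence on `sva`, which is only needed when `k = NULL`.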

dcgerard/succotashr documentation built on May 15, 2019, 1:25 a.m.
