This function implements the full SUCCOTASH method. First, it
rotates the response and explanatory variables into a part that we
use to estimate the confounding variables and the variances, and a
part that we use to estimate the coefficients of the observed
covariates. This function will implement a factor analysis for the
first part then run succotash_given_alpha for the second part.
succotash(Y, X, k = NULL, sig_reg = 0.01, num_em_runs = 2,
  z_start_sd = 1, two_step = TRUE, fa_method = c("pca", "reg_mle",
  "quasi_mle", "homoPCA", "pca_shrinkvar", "mod_fa", "flash_hetero", "non_homo",
  "non_hetero", "non_shrinkvar"), lambda_type = c("zero_conc", "ones"),
  mix_type = c("normal", "uniform"), likelihood = c("normal", "t"),
  lambda0 = 10, tau_seq = NULL, em_pi_init = NULL,
  plot_new_ests = FALSE, em_itermax = 200, var_scale = TRUE,
  inflate_var = 1, optmethod = c("coord", "em"), use_ols_se = FALSE,
  z_init_type = c("null_mle", "random"), var_scale_init_type = c("null_mle",
  "one", "random"))
Y | An
X | An
k | An integer. The number of hidden confounders. If
sig_reg | A numeric. If
num_em_runs | An integer. The number of times we should run the EM algorithm.
z_start_sd | A positive numeric. At the beginning of each EM algorithm,
two_step | A logical. Should we run the two-step SUCCOTASH procedure of inflating the variance (
fa_method | Which factor analysis method should we use? The regularized MLE implemented in
lambda_type | See
mix_type | Should the prior be a mixture of normals
likelihood | Which likelihood should we use? Normal (
lambda0 | If
tau_seq | A vector of length
em_pi_init | A vector of length
plot_new_ests | A logical. Should we plot the mixing proportions at each iteration of the EM algorithm?
em_itermax | A positive numeric. The maximum number of iterations to run during the EM algorithm.
var_scale | A logical. Should we update the scaling on the variances (
inflate_var | A positive numeric. The multiplicative amount to inflate the variance estimates by. There is no theoretical justification for it to be anything but 1, but I have it in here to play around with it.
optmethod | Either coordinate ascent (
use_ols_se | A logical. Should we use the standard formulas for OLS of Y on X to get the estimates of the variances (
z_init_type | How should we initialize the confounders? At the all-null MLE (
var_scale_init_type | If
The assumed model is
Y = Xβ + Zα + E.
Y is an n by p matrix of response variables. For example, each
row might be an array of log-transformed and quantile-normalized
gene-expression data. X is an n by q matrix of
observed covariates. All but the last column of X are assumed to
be nuisance covariates whose coefficients are not of interest. For
example, the first column
might be a vector of ones to include an intercept. β is
a q by p matrix of corresponding coefficients. Z
is an n by k matrix of confounder
variables. α is the corresponding k by p
matrix of coefficients for the unobserved confounders. E is an
n by p matrix of error terms. E is assumed to be
matrix normal with identity row covariance and diagonal column
covariance Σ. That is, the columns are heteroscedastic
while the rows are homoscedastic and independent.
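To make the dimensions in this model concrete, the following sketch simulates data from it in base R. The sample sizes, the number of non-zero effects, and the chi-squared draw for the diagonal of Σ are illustrative assumptions, not defaults of the package.

```r
# Simulate Y = X beta + Z alpha + E with illustrative (assumed) dimensions.
set.seed(1)
n <- 20   # samples
p <- 100  # responses (e.g. genes)
q <- 2    # observed covariates
k <- 3    # hidden confounders

# First column is a nuisance intercept, last column is the covariate of interest.
X <- cbind(1, rnorm(n))

beta <- matrix(0, nrow = q, ncol = p)
beta[1, ] <- rnorm(p)                 # nuisance coefficients
beta[2, 1:10] <- rnorm(10, sd = 2)    # only the first 10 effects are non-zero

Z <- matrix(rnorm(n * k), nrow = n)       # unobserved confounders, n by k
alpha <- matrix(rnorm(k * p), nrow = k)   # confounder coefficients, k by p

# Diagonal column covariance: each column of E gets its own variance,
# while rows are independent with identical covariance.
sig_diag <- rchisq(p, df = 4) / 4
E <- matrix(rnorm(n * p), nrow = n) %*% diag(sqrt(sig_diag))

Y <- X %*% beta + Z %*% alpha + E
dim(Y)
```

A matrix Y and design X built this way have the shapes that the Y and X arguments expect.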
This function will first rotate Y and X using the QR
decomposition. This separates the model into three parts. The first
part only contains nuisance parameters, the second part contains
the coefficients of interest, and the third part contains the
confounders. succotash applies a factor analysis to the
third part to estimate the confounding factors, then runs an EM
algorithm on the second part to estimate the coefficients of
interest.
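The rotation described above can be sketched in base R as follows. This is a minimal illustration that assumes the last column of X is the covariate of interest; the names Y_nuis, Y_int, and Y_conf are hypothetical and are not taken from the package.

```r
# Rotate Y by the full Q from the QR decomposition of X, then split the rows.
set.seed(2)
n <- 20; p <- 50; q <- 2
X <- cbind(1, rnorm(n))                  # intercept (nuisance) + covariate of interest
Y <- matrix(rnorm(n * p), nrow = n)      # placeholder responses

qr_x <- qr(X)
Q <- qr.Q(qr_x, complete = TRUE)         # n by n orthogonal matrix
R <- qr.R(qr_x, complete = TRUE)         # n by q, zero below row q

Y_rot <- crossprod(Q, Y)                 # t(Q) %*% Y

# Part 1: rows 1..(q - 1) absorb the nuisance coefficients.
Y_nuis <- Y_rot[seq_len(q - 1), , drop = FALSE]
# Part 2: row q carries the coefficients of interest (plus confounding).
Y_int <- Y_rot[q, , drop = FALSE]
# Part 3: rows (q + 1)..n contain no X signal, only confounders and noise.
Y_conf <- Y_rot[(q + 1):n, , drop = FALSE]

# Dividing part 2 by R[q, q] recovers the OLS estimate of the last row of beta.
betahat_ols <- Y_int / R[q, q]
```

Because t(Q) %*% X equals R, which is zero below row q, the third part is free of the observed covariates and can be used for the factor analysis.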
Many forms of factor analysis are available. The default is PCA with the column-wise residual mean-squares as the estimates of the column-wise variances.
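As a rough illustration of what PCA with column-wise residual mean squares could look like, here is a small base R sketch. The function name pca_fa, the choice to put the singular values into the factor scores, and the divisor nrow(Y) - k are assumptions made for this example; the package's own implementation may differ.

```r
# Hypothetical helper: truncated-SVD factor analysis with column-wise variances.
pca_fa <- function(Y, k) {
  sv <- svd(Y, nu = k, nv = k)
  d_k <- sv$d[seq_len(k)]
  Z_hat <- sv$u %*% diag(d_k, nrow = k)    # factor scores, nrow(Y) by k
  alpha_hat <- t(sv$v)                     # loadings, k by ncol(Y)
  resid <- Y - Z_hat %*% alpha_hat         # what the first k components miss
  # Column-wise residual mean squares as variance estimates (assumed divisor).
  sig_diag <- colSums(resid^2) / (nrow(Y) - k)
  list(Z = Z_hat, alpha = alpha_hat, sig_diag = sig_diag)
}

# Usage on simulated low-rank-plus-noise data.
set.seed(3)
m <- 18; p <- 50; k <- 2
Y_conf <- matrix(rnorm(m * k), nrow = m) %*% matrix(rnorm(k * p), nrow = k) +
  matrix(rnorm(m * p), nrow = m)
fa_out <- pca_fa(Y_conf, k)
```

The returned alpha and sig_diag play the roles of the estimated confounder coefficients and column-wise variances that the second step would consume.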
See succotash_given_alpha for details of output.
Y1_scaled
The OLS estimates.
sig_diag_scaled
The estimated standard errors of the
estimated effects (calculated from the factor analysis step)
times scale_val.
sig_diag
The estimates of the gene-wise variances (but not
times scale_val).
pi0
A non-negative numeric. The marginal probability of
zero.
alpha_scaled
The scaled version of the estimated
coefficients of the hidden confounders.
Z
A vector of numerics. Estimated rotated confounder in
second step of succotash.
pi_vals
A vector of numerics between 0 and 1. The mixing
proportions.
tau_seq
A vector of non-negative numerics. The mixing
standard deviations (not variances).
lfdr
A vector of numerics between 0 and 1. The local false
discovery rate. I.e. the posterior probability of a coefficient
being zero.
lfsr
A vector of numerics between 0 and 1. The local false
sign rate. I.e. the posterior probability of making a sign error
if one chose the most probable sign.
qvals
A vector of numerics between 0 and 1. The
q-values. The average error rate if we reject all hypotheses that
have smaller q-value.
betahat
A vector of numerics. The posterior mean of the
coefficients.
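To show how quantities such as lfdr, lfsr, qvals, and betahat relate to pi_vals and tau_seq, the sketch below computes them for a mixture-of-normals prior in a simplified setting with no confounders and known standard errors. The function posterior_summaries and its inputs are hypothetical, and the q-values use one common construction (the running mean of the sorted lfdr); this is not the package's internal code.

```r
# Hypothetical helper: posterior summaries under a zero-centered normal mixture prior.
posterior_summaries <- function(betahat, se, pi_vals, tau_seq) {
  p <- length(betahat)
  M <- length(tau_seq)
  tau2 <- matrix(tau_seq^2, nrow = p, ncol = M, byrow = TRUE)
  se2 <- matrix(se^2, nrow = p, ncol = M)

  # Marginal likelihood of each observation under each mixture component.
  lik <- matrix(dnorm(rep(betahat, M), mean = 0, sd = as.vector(sqrt(tau2 + se2))),
                nrow = p)
  w <- sweep(lik, 2, pi_vals, "*")
  w <- w / rowSums(w)                          # posterior component weights

  # Conjugate normal updates within each component.
  post_mean <- betahat * tau2 / (tau2 + se2)
  post_sd <- sqrt(tau2 * se2 / (tau2 + se2))

  # P(beta < 0 | data, component); the point mass at zero contributes nothing.
  neg_k <- matrix(pnorm(0, as.vector(post_mean), as.vector(post_sd)), nrow = p)
  neg_k[tau2 == 0] <- 0

  lfdr <- rowSums(w[, tau_seq == 0, drop = FALSE])   # P(beta = 0 | data)
  neg <- rowSums(w * neg_k)
  pos <- 1 - lfdr - neg
  lfsr <- 1 - pmax(neg, pos)                   # error if we pick the likelier sign

  # q-values as the running mean of the sorted lfdr.
  ord <- order(lfdr)
  qvals <- numeric(p)
  qvals[ord] <- cumsum(lfdr[ord]) / seq_len(p)

  list(lfdr = lfdr, lfsr = lfsr, qvals = qvals, betahat = rowSums(w * post_mean))
}

out <- posterior_summaries(betahat = c(0, 0.1, 5, -4), se = rep(1, 4),
                           pi_vals = c(0.5, 0.25, 0.25), tau_seq = c(0, 1, 3))
```

In this toy call the large observations (5 and -4) receive small lfdr and lfsr, while the observations near zero are shrunk strongly toward zero.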
succotash_given_alpha, factor_mle, succotash_summaries.