stramash.workhorse: Detailed Adaptive Shrinkage function

Description Usage Arguments Details Value See Also Examples

Description

Takes vectors of estimates (betahat) and their standard errors (sebetahat) or a list of mixture error distributions (errordist), and applies shrinkage to them, using Empirical Bayes methods, to compute shrunk estimates for beta. This is the more detailed version of stramash for "research" use. Most users will be happy with the stramash function, which provides the same usage, but documents only the main options for simplicity.

Usage

1
2
3
4
5
6
7
8
stramash.workhorse(betahat, errordist = NULL, sebetahat = NULL,
  likelihood = c("normal", "t", "laplace"), gridsize = 100,
  method = c("fdr", "shrink"), mixcompdist = c("uniform", "halfuniform",
  "normal", "+uniform", "-uniform"), optmethod = c("mixIP", "mixEM",
  "mixVBEM"), df = NULL, randomstart = FALSE, nullweight = 10,
  nonzeromode = FALSE, pointmass = NULL, prior = c("nullbiased",
  "uniform", "unit"), mixsd = NULL, gridmult = sqrt(2), outputlevel = 2,
  g = NULL, fixg = FALSE, model = c("EE", "ET"), control = list())

Arguments

betahat

a p vector of estimates

errordist

A list of objects of either class normalmix or unimix. The length of this list must be the length of betahat. Note that errordist[[i]] is the ith error distribution of betahat[i]. Defaults to NULL, in which case stramash.workhorse will assume either a normal, t, or Laplace likelihood, depending on the value for likelihood.

sebetahat

a p vector of corresponding standard errors

likelihood

One of the pre-specified likelihoods available. So far, they are "normal", "t", and "laplace".

gridsize

The size of the grid if you are using one of the pre-specified likelihoods.

method

specifies how stramash is to be run. Can be "shrinkage" (if main aim is shrinkage) or "fdr" (if main aim is to assess fdr or fsr) This is simply a convenient way to specify certain combinations of parameters: "shrinkage" sets pointmass = FALSE and prior = "uniform"; "fdr" sets pointmass = TRUE and prior = "nullbiased".

mixcompdist

distribution of components in mixture ("uniform", "halfuniform", "normal", "+uniform", or "-uniform"), the default is "uniform". If you believe your effects may be asymmetric, use "halfuniform". If you want to allow only positive/negative effects use "+uniform"/"-uniform".

optmethod

specifies optimization method used. Default is "mixIP", an interior point method, if REBayes is installed; otherwise an EM algorithm is used. The interior point method is faster for large problems (n>2000).

df

appropriate degrees of freedom for (t) distribution of betahat/sebetahat if likelihood = "t".

randomstart

logical, indicating whether to initialize EM randomly. If FALSE, then initializes to prior mean (for EM algorithm) or prior (for VBEM)

nullweight

scalar, the weight put on the prior under "nullbiased" specification, see prior

nonzeromode

logical, indicating whether to use a non-zero unimodal mixture(default is FALSE)

pointmass

logical, indicating whether to use a point mass at zero as one of components for a mixture distribution

prior

string, or numeric vector indicating Dirichlet prior on mixture proportions (defaults to "uniform", or (1,1...,1); also can be "nullbiased" (nullweight,1,...,1) to put more weight on first component), or "unit" (1/K,...,1/K) [for optmethod = mixVBEM version only]

mixsd

vector of standard deviations for underlying mixture components for the prior.

gridmult

the multiplier by which the default grid values for mixsd differ by one another. (Smaller values produce finer grids).

outputlevel

determines amount of output [0 = just fitted g; 1 = also PosteriorMean and PosteriorSD; 2 = everything usually needed; 3 = also include results of mixture fitting procedure (includes matrix of log-likelihoods used to fit mixture).

g

the prior distribution for beta (usually estimated from the data; this is used primarily in simulated data to do computations with the "true" g)

fixg

if TRUE, don't estimate g but use the specified g - useful for computations under the "true" g in simulations

model

c("EE","ET") specifies whether to assume exchangeable effects (EE) or exchangeable T stats (ET).

control

A list of control parameters for the optmization algorithm. Default value is set to be control.default = list(K = 1, method = 3, square = TRUE, step.min0 = 1, step.max0 = 1, mstep = 4, kr = 1, objfn.inc = 1, tol = 1.e-07, maxiter = 5000, trace = FALSE). User may supply changes to this list of parameter, say, control = list(maxiter = 10000, trace = TRUE).

Details

See README.md for more details.

Value

stramash returns an object of class "ash", a list with some or all of the following elements (determined by outputlevel)

fitted.g

fitted mixture, either a normalmix or unimix

loglik

log P(D|mle(pi))

logLR

log[P(D|mle(pi))/P(D|beta==0)]

PosteriorMean

A vector consisting the posterior mean of beta from the mixture

PosteriorSD

A vector consisting the corresponding posterior standard deviation

PositiveProb

A vector of posterior probability that beta is positive

NegativeProb

A vector of posterior probability that beta is negative

ZeroProb

A vector of posterior probability that beta is zero

lfsr

The local false sign rate

lfdr

A vector of estimated local false discovery rate

qvalue

A vector of q values

svalue

A vector of s values

call

a call in which all of the specified arguments are specified by their full names

excludeindex

the vector of index of observations with 0 standard error; if none, then returns NULL

model

either "EE" or "ET", denoting whether exchangeable effects (EE) or exchangeable T stats (ET) has been used

optmethod

the optimization method used

data

a list consisting the input betahat and sebetahat (only included if outputlevel>2)

fit

a list containing results of mixture optimization, and matrix of component log-likelihoods used in this optimization

See Also

stramash for simplified specification of stramash function

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
beta = c(rep(0, 100), stats::rnorm(100))
sebetahat = abs(stats::rnorm(200, 0, 1))
betahat = stats::rnorm(200, beta, sebetahat)
beta.stramash = stramash.workhorse(betahat = betahat, sebetahat = sebetahat)
names(beta.stramash)
summary(beta.stramash)
head(as.data.frame(beta.stramash))
graphics::plot(betahat, beta.stramash$PosteriorMean, xlim=c(-4, 4), ylim=c(-4, 4))

## Testing the non-zero mode feature
betahat = betahat + 5
beta.stramash = stramash.workhorse(betahat = betahat, sebetahat = sebetahat)
graphics::plot(betahat, beta.stramash$PosteriorMean)
summary(beta.stramash)

## Running stramash with a pre-specified g, rather than estimating it
beta = c(rep(0, 100), stats::rnorm(100))
sebetahat = abs(stats::rnorm(200, 0, 1))
betahat = stats::rnorm(200, beta, sebetahat)
true_g = ashr::normalmix(c(0.5, 0.5), c(0, 0), c(0, 1)) # define true g
## Passing this g into stramash causes it to
## i) take the sd and the means for each component from this g, and
## ii) initialize pi to the value from this g.
beta.stramash = stramash.workhorse(betahat = betahat, sebetahat = sebetahat,
                                   g = true_g, fixg = TRUE)

dcgerard/stramash documentation built on May 15, 2019, 1:24 a.m.