bad_ash.workhorse: Detailed Adaptive Shrinkage function

Description Usage Arguments Details Value See Also Examples

Description

Takes vectors of estimates (betahat) and their standard errors (sebetahat), and applies shrinkage to them, using Empirical Bayes methods, to compute shrunk estimates for beta. This is the more detailed version of ash for "research" use. Most users will be happy with the ash function, which provides the same usage, but documents only the main options for simplicity.

Usage

1
2
3
4
5
6
bad_ash.workhorse(betahat, sebetahat, method = c("fdr", "shrink"),
  mixcompdist = c("uniform", "halfuniform", "normal", "+uniform", "-uniform"),
  optmethod = c("mixIP", "cxxMixSquarem", "mixEM", "mixVBEM"), df = NULL,
  nullweight = 10, pointmass = TRUE, prior = c("nullbiased", "uniform",
  "unit"), mixsd = NULL, gridmult = sqrt(2), outputlevel = 2, g = NULL,
  fixg = FALSE, mode = 0, alpha = 0, control = list(), lik = NULL)

Arguments

betahat

a p vector of estimates

sebetahat

a p vector of corresponding standard errors

method

specifies how ash is to be run. Can be "shrinkage" (if main aim is shrinkage) or "fdr" (if main aim is to assess fdr or fsr) This is simply a convenient way to specify certain combinations of parameters: "shrinkage" sets pointmass=FALSE and prior="uniform"; "fdr" sets pointmass=TRUE and prior="nullbiased".

mixcompdist

distribution of components in mixture ( "uniform","halfuniform","normal" or "+uniform"), the default value is "uniform" use "halfuniform" to allow for assymetric g, and "+uniform"/"-uniform" to constrain g to be positive/negative.

optmethod

specifies the function implementing an optimization method. Default is "mixIP", an interior point method, if REBayes is installed; otherwise an EM algorithm is used. The interior point method is faster for large problems (n>2000), particularly when method="shrink".

df

appropriate degrees of freedom for (t) distribution of betahat/sebetahat, default is NULL(Gaussian)

nullweight

scalar, the weight put on the prior under "nullbiased" specification, see prior

pointmass

logical, indicating whether to use a point mass at zero as one of components for a mixture distribution

prior

string, or numeric vector indicating Dirichlet prior on mixture proportions (defaults to "uniform", or (1,1...,1); also can be "nullbiased" (nullweight,1,...,1) to put more weight on first component), or "unit" (1/K,...,1/K) [for optmethod=mixVBEM version only]

mixsd

vector of sds for underlying mixture components

gridmult

the multiplier by which the default grid values for mixsd differ by one another. (Smaller values produce finer grids)

outputlevel

determines amount of output. There are several numeric options [0=just fitted g; 1=also PosteriorMean and PosteriorSD; 2= everything usually needed; 3=also include results of mixture fitting procedure (includes matrix of log-likelihoods used to fit mixture); 4= output additional things required by flash (flash_data)]. Otherwise the user can also specify the output they require in detail (see Examples)

g

the prior distribution for beta (usually estimated from the data; this is used primarily in simulated data to do computations with the "true" g)

fixg

if TRUE, don't estimate g but use the specified g - useful for computations under the "true" g in simulations

mode

either numeric (indicating mode of g) or string "estimate", to indicate mode should be estimated.

alpha

numeric value of alpha parameter in the model

control

A list of control parameters passed to optmethod

lik

contains details of the likelihood used; for general ash

Details

See readme for more details.

Value

ash returns an object of class "ash", a list with some or all of the following elements (determined by outputlevel)

fitted_g

fitted mixture, either a normalmix or unimix

loglik

log P(D|mle(pi))

logLR

log[P(D|mle(pi))/P(D|beta==0)]

result

A dataframe whose columns are

NegativeProb

A vector of posterior probability that beta is negative

PositiveProb

A vector of posterior probability that beta is positive

lfsr

A vector of estimated local false sign rate

lfdr

A vector of estimated local false discovery rate

qvalue

A vector of q values

svalue

A vector of s values

PosteriorMean

A vector consisting the posterior mean of beta from the mixture

PosteriorSD

A vector consisting the corresponding posterior standard deviation

call

a call in which all of the specified arguments are specified by their full names

data

a list containing details of the data and models used (mostly for internal use)

fit_details

a list containing results of mixture optimization, and matrix of component log-likelihoods used in this optimization

See Also

ash for simplified specification of ash function

ashci for computation of credible intervals after getting the ash object return by ash()

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
beta = c(rep(0,100),rnorm(100))
sebetahat = abs(rnorm(200,0,1))
betahat = rnorm(200,beta,sebetahat)
beta.ash = ash(betahat, sebetahat)
names(beta.ash)
head(beta.ash$result) #dataframe of results
head(get_lfsr(beta.ash)) #get lfsr
head(get_pm(beta.ash)) #get posterior mean
graphics::plot(betahat,get_pm(beta.ash),xlim=c(-4,4),ylim=c(-4,4))

CIMatrix=ashci(beta.ash,level=0.95) #note currently default is only compute CIs for lfsr<0.05
print(CIMatrix)

#Running ash with different error models
beta.ash1 = ash(betahat, sebetahat, lik = normal_lik())
beta.ash2 = ash(betahat, sebetahat, lik = t_lik(df=4))

e = rnorm(100)+log(rf(100,df1=10,df2=10)) # simulated data with log(F) error
e.ash = ash(e,1,lik=logF_lik(df1=10,df2=10))


# Specifying the output
beta.ash = ash(betahat, sebetahat, output = c("fitted_g","logLR","lfsr"))

#Illustrating the non-zero mode feature
betahat=betahat+5
beta.ash = ash(betahat, sebetahat)
graphics::plot(betahat,beta.ash$result$PosteriorMean)
betan.ash=ash(betahat, sebetahat,mode=5)
graphics::plot(betahat, betan.ash$result$PosteriorMean)
summary(betan.ash)


#Running ash with a pre-specified g, rather than estimating it
beta = c(rep(0,100),rnorm(100))
sebetahat = abs(rnorm(200,0,1))
betahat = rnorm(200,beta,sebetahat)
true_g = normalmix(c(0.5,0.5),c(0,0),c(0,1)) # define true g
## Passing this g into ash causes it to i) take the sd and the means
## for each component from this g, and ii) initialize pi to the value
## from this g.
beta.ash = ash(betahat, sebetahat,g=true_g,fixg=TRUE)

dcgerard/sabotash documentation built on May 15, 2019, 1:24 a.m.