mash: Apply mash method to data

View source: R/mash.R

mash    R Documentation

Apply mash method to data

Description

Apply mash method to data

Usage

mash(
  data,
  Ulist = NULL,
  gridmult = sqrt(2),
  grid = NULL,
  normalizeU = TRUE,
  usepointmass = TRUE,
  g = NULL,
  fixg = FALSE,
  prior = c("nullbiased", "uniform"),
  nullweight = 10,
  optmethod = c("mixSQP", "mixIP", "mixEM", "cxxMixSquarem"),
  control = list(),
  verbose = TRUE,
  add.mem.profile = FALSE,
  algorithm.version = c("Rcpp", "R"),
  pi_thresh = 1e-10,
  A = NULL,
  posterior_samples = 0,
  seed = 123,
  outputlevel = 2,
  output_lfdr = FALSE
)

Arguments

data

a mash data object containing the Bhat matrix, the standard errors, and the alpha value; created using mash_set_data or mash_set_data_contrast

Ulist

a list of covariance matrices to use (see normalizeU for rescaling these matrices)

gridmult

scalar indicating factor by which adjacent grid values should differ; close to 1 for fine grid

grid

vector of grid values to use (scaling factors omega in paper)
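
As a rough illustration of how gridmult shapes the grid, the base R sketch below builds a geometric sequence in which adjacent values differ by a factor of gridmult. The helper make_grid and the endpoints grid_min and grid_max are hypothetical names chosen for this example; when grid = NULL, mash chooses its own range from the data.

```r
# Hypothetical helper (not part of mashr): geometric grid between two
# assumed endpoints, with adjacent values differing by a factor of gridmult.
make_grid <- function(grid_min, grid_max, gridmult = sqrt(2)) {
  # Number of multiplicative steps needed to get from grid_max down to grid_min.
  npoint <- ceiling(log2(grid_max / grid_min) / log2(gridmult))
  gridmult^((-npoint):0) * grid_max
}

grid_coarse <- make_grid(0.125, 4, gridmult = 2)        # fewer, widely spaced values
grid_fine   <- make_grid(0.125, 4, gridmult = sqrt(2))  # more, closely spaced values
```

A gridmult closer to 1 gives more grid values over the same range, and therefore more mixture components to fit.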

normalizeU

whether or not to normalize the U covariances to have maximum of 1 on diagonal
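
The rescaling described here can be sketched in base R as follows. The helper normalize_cov is a hypothetical name, and the handling of an all-zero matrix is an assumption made for this example rather than a statement about the package's internal code.

```r
# Hypothetical helper (not part of mashr): rescale a covariance matrix so
# that its largest diagonal entry equals 1.
normalize_cov <- function(U) {
  d <- max(diag(U))
  if (d == 0) return(U)  # assumption: leave an all-zero matrix unchanged
  U / d
}

U        <- matrix(c(4, 2, 2, 1), nrow = 2)
U_scaled <- normalize_cov(U)  # diagonal becomes 1 and 0.25
```

After rescaling, the overall size of the effects is carried by the grid values rather than by the covariance matrices themselves.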

usepointmass

whether to include a point mass at 0, corresponding to null in every condition

g

the value of g obtained from a previous mash fit - an alternative to supplying Ulist, grid and usepointmass

fixg

if g is supplied, allows the mixture proportions to be fixed rather than estimated; e.g., useful for fitting mash to test data after fitting it to training data
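
A minimal sketch of the training/test workflow mentioned above, assuming the mashr package is installed. The random half-and-half split and the variable names are choices made for this example only.

```r
library(mashr)

set.seed(1)
simdata <- simple_sims(500, 5, 1)

# Split the simulated effects at random into training and test halves.
n     <- nrow(simdata$Bhat)
train <- sample(n, n / 2)
data_train <- mash_set_data(simdata$Bhat[train, ], simdata$Shat[train, ])
data_test  <- mash_set_data(simdata$Bhat[-train, ], simdata$Shat[-train, ])

# Estimate the mixture proportions on the training data.
fit_train <- mash(data_train, cov_canonical(data_train), verbose = FALSE)

# Reuse the fitted g on the test data without re-estimating it.
fit_test <- mash(data_test, g = get_fitted_g(fit_train), fixg = TRUE,
                 verbose = FALSE)
```

With fixg = TRUE the second call computes posterior quantities for the test data under the mixture estimated from the training data.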

prior

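indicates what penalty to use on the likelihood, if any: "nullbiased" penalizes the likelihood to favor the null component (see nullweight), whereas "uniform" applies no penalty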

nullweight

scalar, the weight put on the prior under “nullbiased” specification, see “prior”.

optmethod

name of optimization method to use

control

A list of control parameters passed to optmethod.

verbose

If TRUE, print progress to R console.

add.mem.profile

If TRUE, print memory usage to R console (requires the R package 'profmem').

algorithm.version

Indicates whether to use R or Rcpp version

pi_thresh

threshold below which mixture components are ignored in computing posterior summaries (to speed calculations by ignoring negligible components)
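
The thresholding idea can be illustrated in base R. The mixture proportions below are made-up values for this example, and the comparison is a sketch of the idea rather than the package's internal code.

```r
# Made-up mixture proportions; two of the four components are negligible.
pi_hat    <- c(0.7, 0.3 - 1e-12, 1e-12, 0)
pi_thresh <- 1e-10

# Keep only components whose weight exceeds the threshold.
keep    <- pi_hat > pi_thresh
pi_kept <- pi_hat[keep]
```

Dropping such components changes posterior summaries only negligibly while reducing the number of components that must be evaluated for each effect.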

A

the linear transformation matrix, Q x R matrix. This is used to compute the posterior for Ab.
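
To make the role of A concrete, the base R sketch below applies the standard identities E[Ab] = A E[b] and Cov(Ab) = A Cov(b) A' to an assumed posterior mean and covariance for a single effect in R = 3 conditions. The numbers are invented for this example, and the code is not the package's internal computation.

```r
# Assumed posterior mean and covariance of b across R = 3 conditions.
mu <- c(1, 2, 4)
V  <- diag(c(0.5, 0.2, 0.1))

# A is Q x R; here Q = 2 contrasts between adjacent conditions.
A <- rbind(c(1, -1,  0),
           c(0,  1, -1))

mean_Ab <- drop(A %*% mu)    # posterior mean of Ab
cov_Ab  <- A %*% V %*% t(A)  # posterior covariance of Ab
```

Here each row of A defines one linear combination of the R effects, so the transformed posterior describes Q quantities per effect.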

posterior_samples

the number of samples to be drawn from the posterior distribution of each effect.

seed

A random number seed to use when sampling from the posteriors. It is used when posterior_samples > 0.

outputlevel

controls the amount of computation and output. 1: output only the estimated mixture component proportions; 2: also output posterior estimates; 3: also output posterior covariance matrices; 4: also output likelihood matrices

output_lfdr

If output_lfdr = TRUE, output local false discovery rate (lfdr) estimates. The lfdr tends to be sensitive to mis-estimated covariance matrices, so we generally do not recommend using it; we recommend using the local false sign rate (lfsr) instead, which is always returned regardless of the value of output_lfdr.

Value

a list with elements result, loglik and fitted_g
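
A short sketch of inspecting the returned list, assuming the mashr package is installed. The simulated data and variable names are choices made for this example; get_pm, get_lfsr and get_fitted_g are accessor functions available once mashr is loaded.

```r
library(mashr)

set.seed(1)
simdata <- simple_sims(50, 5, 1)
data    <- mash_set_data(simdata$Bhat, simdata$Shat)
res     <- mash(data, cov_canonical(data), verbose = FALSE)

res$loglik                    # log-likelihood of the fitted model
g_hat <- get_fitted_g(res)    # fitted mixture: proportions, covariances, grid
pm    <- get_pm(res)          # posterior means, one row per effect
lfsr  <- get_lfsr(res)        # local false sign rates, same layout as pm
```

The accessor functions read from the result element, so they depend on outputlevel being high enough to compute posterior estimates.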

Examples

Bhat     = matrix(rnorm(100),ncol=5) # create some simulated data
Shat     = matrix(rep(1,100),ncol=5)
data     = mash_set_data(Bhat,Shat, alpha=1)
U.c      = cov_canonical(data)
res.mash = mash(data,U.c)

# Run mash with penalty exponent on null term equal to 100.
# See "False discovery rates: a new deal" (M. Stephens 2017),
# supplementary material S.2.5 for more details.
set.seed(1)
simdata = simple_sims(500,5,1)
data    = mash_set_data(simdata$Bhat,simdata$Shat)
U.c     = cov_canonical(data)
res0    = mash(data,U.c)
res1    = mash(data,U.c,prior = "nullbiased",nullweight = 101)
plot(res0$fitted_g$pi,res1$fitted_g$pi,pch = 20)
abline(a = 0,b = 1,col = "skyblue",lty = "dashed")


mashr documentation built on Oct. 18, 2023, 5:08 p.m.