dot-WriteData: Write Java MCMC format data file


Description

Writes a data file in the format required by the Java MCMC program.

Usage

.WriteData(
  data.file,
  likelihood,
  data,
  outcome.var = NULL,
  times.var = NULL,
  confounders = NULL,
  predictors = NULL,
  model.space.priors,
  beta.priors = NULL,
  beta.prior.partitions = NULL,
  g.prior = FALSE,
  model.tau = FALSE,
  tau = NULL,
  xtx.ridge.term = 0,
  enumerate.up.to.dim = 0,
  n = n,
  xTx = NULL,
  z = NULL,
  ns.each.ethnicity = NULL,
  initial.model = NULL,
  trait.variance = NULL,
  logistic.likelihood.weights = NULL,
  mrloss.w = 0,
  mrloss.function = "variance",
  mrloss.marginal.causal.effects = NULL,
  mrloss.marginal.causal.effect.ses = NULL,
  mafs.if.independent = mafs.if.independent,
  debug = FALSE
)

Arguments

data.file

Path to which the .txt data file should be written.

likelihood

Type of model to fit. Current options are "Logistic" (for binary data), "CLogLog" (complementary log-log link for binary data), "Weibull" (for survival data), "Linear" (for linear regression), "LinearConj" (linear regression exploiting conjugate results), "JAM" (for conjugate linear regression using summary statistics, integrating out parameters) and "JAM_MCMC" (for linear regression using summary statistics, with full MCMC of all parameters).

data

Matrix or data frame containing the data to analyse. Rows are individuals, and columns contain the variables and outcome. If modelling summary statistics, specify X.ref, marginal.betas, and n instead (see below).

outcome.var

Name of the outcome variable in data. For survival data see times.var below. If modelling summary statistics with JAM this can be left NULL, but you must specify X.ref, marginal.betas and n instead (see below).

times.var

SURVIVAL DATA ONLY. Name of the column in data which contains the event times.

confounders

Optional vector of confounders to fix in the model at all times, i.e. exclude from model selection.

model.space.priors

Must be specified if model.selection is set to TRUE. Two options are available. 1) A fixed prior is placed on the proportion of causal covariates, and all models with the same number of covariates are equally likely. This is effectively a Poisson prior over the different possible model sizes. A list must be supplied for 'model.space.priors' with an element "Rate", specifying the prior proportion of causal covariates, and an element "Variables" containing the list of covariates included in the model search. 2) The prior proportion of causal covariates is treated as unknown and given a Beta(a, b) hyper-prior, in which case elements "a" and "b" must be included in the 'model.space.priors' list rather than "Rate". Higher values of "b" relative to "a" will encourage sparsity. NOTE: Different model space priors can be specified for different collections of covariates by providing a list of lists, each element of which is itself a model.space.priors list, as described above, for a particular subset of the covariates.
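
The two forms, and the list-of-lists variant, might be built as in the following sketch. The covariate names (snp1 to snp6) and all numeric values are made up for illustration, and passing "Variables" as a character vector is an assumption.

```r
# Hypothetical covariate names, for illustration only.
covariates <- paste0("snp", 1:6)

# Option 1: fixed prior proportion of causal covariates ("Rate").
fixed.prior <- list("Rate" = 0.1, "Variables" = covariates)

# Option 2: Beta(a, b) hyper-prior on the proportion of causal covariates.
# A larger "b" relative to "a" encourages sparsity.
beta.hyper.prior <- list("a" = 1, "b" = 5, "Variables" = covariates)

# Different priors for different subsets of covariates: a list of lists,
# each element being a model space prior list for its own subset.
partitioned.priors <- list(
  list("Rate" = 0.5, "Variables" = covariates[1:2]),
  list("a" = 1, "b" = 5, "Variables" = covariates[3:6])
)
```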

beta.priors

This allows fixed (potentially informative) priors to be specified for the covariate effects. A matrix must be passed, with named rows corresponding to parameters, and columns corresponding to the prior mean and variance, in that order. When using this option, priors must be specified either for just the confounders, which are otherwise given fixed N(0,1e6) priors, or for all covariates.
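
A minimal sketch of such a matrix for two confounders follows. The row names (age, sex), the prior values, and the column labels are assumptions made for illustration; only the column order (mean, then variance) comes from the description above.

```r
# Hypothetical confounders with illustrative prior means and variances.
beta.priors <- matrix(
  c(0.0, 1.00,   # age: prior mean 0, prior variance 1
    0.5, 0.25),  # sex: prior mean 0.5, prior variance 0.25
  ncol = 2, byrow = TRUE,
  # Row names identify the parameters; column labels are illustrative only.
  dimnames = list(c("age", "sex"), c("Mean", "Variance"))
)
```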

beta.prior.partitions

Covariate effects under variable selection are ascribed, by default, a common Normal prior, the standard deviation of which is treated as unknown, with a Unif(0.05,2) hyper-prior. This option can be used to partition the covariate effects into different prior groups, each with a separate hierarchical Normal prior. beta.prior.partitions must be a list with as many elements as desired covariate groups. The element for a particular group must in turn be a list containing the following named elements: "Variables", a list of covariates in the prior group, and "UniformA" and "UniformB", the Uniform hyper-parameters for the standard deviation of the Normal prior across their effects.
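
As a sketch, two prior groups could be declared as below. The covariate names and the second group's Uniform bounds are invented for illustration; the first group simply reuses the default Unif(0.05,2) bounds mentioned above.

```r
# Hypothetical covariate names, for illustration only.
covariates <- paste0("snp", 1:6)

beta.prior.partitions <- list(
  # Group 1: same bounds as the default Unif(0.05, 2) hyper-prior.
  list("Variables" = covariates[1:3], "UniformA" = 0.05, "UniformB" = 2),
  # Group 2: tighter, purely illustrative bounds on the prior SD.
  list("Variables" = covariates[4:6], "UniformA" = 0.01, "UniformB" = 1)
)
```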

g.prior

Whether to use a g-prior for the betas, i.e. a multivariate normal with correlation structure proportional to sigma^2*(X'X)^-1, which is thought to aid variable selection in the presence of strong correlation. In the usage shown above the default is FALSE.

tau

Value to use for sparsity parameter tau (under the tau*sigma^2 parameterisation). When using the g-prior, a recommended default is max(n, P^2) where n is the number of individuals, and P is the number of predictors.
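
The recommended default can be computed directly. The values of n and P below are illustrative only.

```r
# Illustrative sizes: n individuals and P predictors.
n <- 1000
P <- 50

# Recommended default for tau when using the g-prior.
tau <- max(n, P^2)
tau  # 2500, because P^2 = 2500 exceeds n = 1000
```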

xtx.ridge.term

Constant to add to the diagonal of X'X before JAM takes the Cholesky decomposition.
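
The effect of a ridge term can be illustrated in plain R. This is only a sketch of the linear algebra on simulated data, not the code JAM itself runs; the matrix X and the value 0.01 are assumptions.

```r
set.seed(1)

# Simulated design matrix with a duplicated column, so that X'X is singular.
X <- matrix(rnorm(20 * 3), nrow = 20, ncol = 3)
X <- cbind(X, X[, 1])
xtx <- crossprod(X)  # X'X

# Add the ridge constant to every diagonal element of X'X.
xtx.ridge.term <- 0.01
xtx.ridged <- xtx + diag(xtx.ridge.term, ncol(xtx))

# The ridged matrix is positive definite, so the Cholesky factor exists.
L <- chol(xtx.ridged)
```

Only the diagonal changes, so the off-diagonal structure of X'X is preserved.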

enumerate.up.to.dim

If greater than 0, posterior inference is made by exhaustively calculating the posterior support for every possible model up to this dimension. Leave at 0 to disable and use RJMCMC instead. The current maximum allowed value is 5.
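
To get a feel for the cost of enumeration, the number of candidate models can be counted as below. The value P = 20 is illustrative, and counting the null model (dimension 0) is an assumption.

```r
# Illustrative number of covariates under selection.
P <- 20
enumerate.up.to.dim <- 5

# Number of distinct models containing 0, 1, ..., 5 of the P covariates.
n.models <- sum(choose(P, 0:enumerate.up.to.dim))
n.models  # 21700
```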

n

The size of the dataset in which the summary statistics (marginal.betas) were calculated.

ns.each.ethnicity

For mJAM: A vector of the sizes of each ethnicity dataset in which the summary statistics were calculated.

initial.model

An initial model for the covariates under selection can be specified as a vector of 0s and 1s. If left unspecified, the null (empty) model is used.
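
One way to build such a vector is sketched below. The covariate names are hypothetical, and it is an assumption that the vector's elements follow the order of the covariates under selection.

```r
# Hypothetical covariates under selection.
covariates <- paste0("snp", 1:6)

# Start the search from a model containing snp2 and snp5 only.
initial.model <- as.integer(covariates %in% c("snp2", "snp5"))
initial.model  # 0 1 0 0 1 0
```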

logistic.likelihood.weights

An optional vector of likelihood weights for logistic regression. These weights multiply the log-likelihood contribution of each individual. The order should match the order of rows in the data matrix.
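
The role of the weights can be sketched in plain R as follows. The data, coefficients and weights are simulated for illustration; this shows the weighting idea only and is not the package's internal code.

```r
set.seed(1)

# Simulated binary outcome and a single covariate for 8 individuals.
n <- 8
y <- rbinom(n, 1, 0.5)
x <- rnorm(n)

# Illustrative weights, one per row of the data, in row order.
logistic.likelihood.weights <- rep(c(1, 2), length.out = n)

# Linear predictor under made-up coefficient values.
eta <- -0.2 + 0.7 * x

# Per-individual Bernoulli log-likelihood contributions:
# log(p) when y = 1 and log(1 - p) when y = 0, with p = plogis(eta).
loglik.i <- y * plogis(eta, log.p = TRUE) + (1 - y) * plogis(-eta, log.p = TRUE)

# Each contribution is multiplied by its weight before summing.
weighted.loglik <- sum(logistic.likelihood.weights * loglik.i)
```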

mrloss.w

The relative weight of the MR log loss function for pleiotropy vs the log likelihood. Default 0.

mrloss.function

Choice of pleiotropic loss function: "steve" or "variance" (default "variance").

mafs.if.independent

If the SNPs are independent then a reference genotype matrix is not required. However, it is still necessary to provide the SNP MAFs here as a named vector. Doing so will lead to X.ref being ignored and the SNPs being modelled as if they are independent. Note that this option does not work with enumeration.
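
A named MAF vector might be derived from genotype counts as in this sketch. The genotype matrix G (coded 0/1/2 copies of an allele) and the SNP names are invented for illustration.

```r
# Hypothetical genotypes: 4 individuals by 3 SNPs, coded as allele counts 0/1/2.
G <- matrix(
  c(0, 1, 2, 0,   # snp1
    1, 1, 0, 0,   # snp2
    2, 2, 1, 2),  # snp3
  nrow = 4,
  dimnames = list(NULL, c("snp1", "snp2", "snp3"))
)

# Frequency of the counted allele, then the minor allele frequency.
allele.freqs <- colMeans(G) / 2
mafs.if.independent <- pmin(allele.freqs, 1 - allele.freqs)
mafs.if.independent  # snp1 = 0.375, snp2 = 0.25, snp3 = 0.125
```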

debug

Whether to output extra information (such as final adaptation proposal SDs) which might help with debugging (default is FALSE).

Value

NA

Author(s)

Paul Newcombe


pjnewcombe/R2BGLiMS documentation built on Feb. 10, 2020, 8:52 p.m.