GridLMM_posterior: Evaluate the posterior of a linear mixed model using the...

View source: R/GridLMM.R

GridLMM_posterior (R Documentation)

Evaluate the posterior of a linear mixed model using the Grid-sampling algorithm

Description

Performs approximate posterior inference of a linear mixed model by evaluating the posterior over a grid of values of the variance component proportions. The grid evaluation uses a heuristic to avoid traversing the entire grid. This should work well as long as the posterior is unimodal.

Usage

GridLMM_posterior(
  formula,
  data,
  weights = NULL,
  relmat = NULL,
  normalize_relmat = TRUE,
  h2_divisions = 10,
  h2_prior = function(h2s, n) 1/n,
  a = 0,
  b = 0,
  inv_prior_X = 0,
  target_prob = 0.99,
  thresh_nonzero = 10,
  thresh_nonzero_marginal = 0,
  V_setup = NULL,
  save_V_folder = NULL,
  diagonalize = T,
  mc.cores = my_detectCores(),
  verbose = T
)

Arguments

formula

A two-sided linear formula, as used in lmer, with the response on the LHS and the fixed and random effects of the model on the RHS. Note: correlated random effects are not implemented, so using one vertical bar (|) or two (||) is identical. At least one random effect is needed.

data

A data frame containing the variables named in formula.

weights

An optional vector of observation-specific weights.

relmat

A list of matrices that are proportional to the (within) covariance structures of the group level effects. The names of the matrices should correspond to the columns in data that are used as grouping factors. All levels of the grouping factor should appear as rownames of the corresponding matrix.

normalize_relmat

Should the relmats be normalized to mean(diag)==1?

h2_divisions

Starting number of divisions of the grid for each variance component.

h2_prior

Function that takes two arguments: 1) a vector of h2s (i.e., variance component proportions), and 2) an integer giving the number of vertices in the full grid. The function should return a non-negative value giving the prior weight of the grid cell corresponding to h2s.

a, b

Shape and rate parameters of the Gamma prior for the residual variance \sigma^2. Setting both to zero gives a limiting "default" prior.

inv_prior_X

Vector of values for the prior precision of each of the fixed effects (including an intercept). Will be recycled if necessary.

target_prob

See Details.

thresh_nonzero

See Details.

thresh_nonzero_marginal

See Details.

V_setup

Optional. A list produced by a GridLMM function containing the pre-processed V decompositions for each grid vertex, or the information necessary to create this. Generally saved from a previous run of GridLMM on the same data.

save_V_folder

Optional. A character vector giving a folder in which to save pre-processed V decomposition files for future or repeated use. If NULL, V decompositions are stored in memory.

diagonalize

If TRUE and the model includes only a single random effect, the "GEMMA" trick will be used to diagonalize V. This is done by calculating the SVD of K, which can be slow for large samples.

mc.cores

Number of cores to use for parallel evaluations.

verbose

Should progress be printed to the screen?

svd_K

If TRUE and nrow(K) < nrow(Z), then the SVD is done on K instead of ZKZ^T.

drop0_tol

Values closer to zero than this will be set to zero when forming sparse matrices.
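The idea behind diagonalize can be illustrated with a minimal base-R sketch. It assumes a single random effect with one observation per level (Z = I) and a total covariance of the form V = h2 * K + (1 - h2) * I; this parameterization and the toy matrix K are assumptions made for illustration, not code taken from GridLMM.

```r
# Toy illustration only; not GridLMM's internal implementation.
set.seed(1)
n <- 6
A <- matrix(rnorm(n * n), n, n)
K <- crossprod(A) / n          # symmetric positive-definite toy relationship matrix
K <- K / mean(diag(K))         # scale so that mean(diag(K)) == 1

sv <- svd(K)                   # for symmetric positive-definite K: K = U diag(d) U^T
U <- sv$u
d <- sv$d

h2 <- 0.3                      # one grid value of the variance component proportion
V <- h2 * K + (1 - h2) * diag(n)

# Rotating by U diagonalizes V for any h2, so the SVD is computed only once.
V_rot <- crossprod(U, V %*% U)
max_offdiag <- max(abs(V_rot - diag(diag(V_rot))))
expected_diag <- h2 * d + (1 - h2)
```

Because the same U works for every grid value of h2, only the diagonal h2 * d + (1 - h2) has to be recomputed at each vertex, which is why the one-time SVD can pay off despite being slow for large samples.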

Details

Posterior inference involves an adaptive grid search. Generally, we start with a very coarse grid (with as few as 2-3 vertices per variance component) and then progressively increase the grid resolution, focusing only on regions of high posterior probability. This is controlled by h2_divisions, target_prob, thresh_nonzero, and thresh_nonzero_marginal. The sampling algorithm is as follows:

  1. Start by evaluating the posterior at each vertex of a trial grid with resolution (vertex spacing) m.

  2. Find the minimum number of vertices needed to sum to target_prob of the current (discrete) posterior. Repeat for the marginal posteriors of each variance component.

  3. If these numbers are smaller than thresh_nonzero or thresh_nonzero_marginal, respectively, form a new grid by increasing the grid resolution to m/2 (that is, halving the spacing between adjacent vertices). Otherwise, STOP.

  4. Begin evaluating the posterior at the new grid only at those grid vertices that are adjacent (in any dimension) to any of the top grid vertices in the old grid.

  5. Re-evaluate the distribution of the posterior over the new grid. If any new vertices contribute to the top target_prob fraction of the overall posterior, include these in the "top" set and return to step 4. Note: the prior weights for the grid vertices must be updated each time the grid increases in resolution.

  6. Repeat steps 4-5 until no new grid vertices contribute to the "top" set.

  7. Repeat steps 2-6 until a STOP is reached at step 3.
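The counting rule in the second step, and the refine-or-stop decision that follows it, can be sketched in base R. This is a simplified one-dimensional illustration; the helper n_top_vertices and the toy posterior values are hypothetical and are not part of GridLMM.

```r
# Hypothetical helper: smallest number of vertices whose normalized
# posterior mass sums to at least target_prob.
n_top_vertices <- function(post, target_prob = 0.99) {
  p <- sort(post / sum(post), decreasing = TRUE)
  which(cumsum(p) >= target_prob)[1]
}

# Toy unnormalized posterior over five grid vertices (made-up values)
post <- c(0.5, 2.5, 60, 30, 7)
n_top <- n_top_vertices(post, target_prob = 0.99)

# Refine the grid while the mass is concentrated on too few vertices
thresh_nonzero <- 10
refine <- n_top < thresh_nonzero
```

Here four of the five vertices are needed to reach 99% of the mass, which is below thresh_nonzero = 10, so under the rule described above the grid would be refined rather than stopped.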

Note: Default parameters for priors give flat (improper) priors. These should be used with care, especially for calculations of Bayes Factors.
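A non-flat alternative for h2_prior might look like the sketch below. It only follows the interface described under Arguments (a vector of h2s and the number of grid vertices n, returning a non-negative weight); the name shrink_prior and its functional form are hypothetical, and whether the weights need to sum to one over the grid is not stated here.

```r
# Default from Usage: equal weight for every grid vertex
flat_prior <- function(h2s, n) 1 / n

# Hypothetical informative prior: weight decays with the total
# proportion of variance assigned to the random effects
shrink_prior <- function(h2s, n) exp(-2 * sum(h2s)) / n

w_flat  <- flat_prior(c(0.2, 0.3), n = 100)
w_small <- shrink_prior(c(0.1, 0.1), n = 100)
w_large <- shrink_prior(c(0.5, 0.4), n = 100)
```

Such a function would be supplied as h2_prior = shrink_prior in the call.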

Value

A list with three elements:

h2s_results

A data frame in which each row is an evaluated grid vertex, with the first l columns giving the h^2 values (one column per variance component) and the final column giving the corresponding posterior mass.

h2s_solutions

A list with the parameters of the normal-inverse-gamma (NIG) distribution for each grid vertex.

V_setup

The V_setup object for this model. Can be re-passed to this function (or other GridLMM functions) to re-fit the model to the same data.
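As a rough sketch of how the returned h2s_results might be summarized, the helper below computes the posterior mean of each h^2 using only the column layout described above (h^2 columns first, posterior mass last). The helper summarize_h2, the toy data frame, and the simulated inputs are hypothetical. The call to GridLMM_posterior uses only arguments listed under Usage, runs only if the package is installed, and has not been verified against a particular package version.

```r
# Hypothetical helper: posterior mean of each h^2 from an
# h2s_results-style data frame (h^2 columns first, mass last).
summarize_h2 <- function(h2s_results) {
  k <- ncol(h2s_results)
  w <- h2s_results[[k]] / sum(h2s_results[[k]])   # renormalize the mass
  h2 <- as.matrix(h2s_results[, -k, drop = FALSE])
  colSums(h2 * w)                                  # weight each row by its mass
}

# Toy stand-in for h2s_results (column names are made up)
toy <- data.frame(h2_id = c(0.2, 0.3, 0.4), posterior = c(0.25, 0.50, 0.25))
toy_mean <- summarize_h2(toy)

# Optional end-to-end call on simulated data, skipped if GridLMM is absent
if (requireNamespace("GridLMM", quietly = TRUE)) {
  set.seed(1)
  ids <- paste0("g", 1:50)
  A <- matrix(rnorm(50 * 200), 50, 200)
  K <- tcrossprod(A) / 200
  dimnames(K) <- list(ids, ids)    # rownames must cover all levels of id
  dat <- data.frame(id = ids, x = rnorm(50), y = rnorm(50))
  fit <- GridLMM::GridLMM_posterior(y ~ x + (1 | id), data = dat,
                                    relmat = list(id = K),
                                    mc.cores = 1, verbose = FALSE)
  print(summarize_h2(fit$h2s_results))
}
```

Indexing columns by position rather than by name keeps the helper independent of whatever column names the package assigns.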


deruncie/GridLMM documentation built on May 2, 2023, 7:18 p.m.