serosolver: Adaptive Metropolis-within-Gibbs/Metropolis Hastings Random...

View source: R/mcmc.R

serosolverR Documentation

Adaptive Metropolis-within-Gibbs/Metropolis Hastings Random Walk Algorithm.

Description

The Adaptive Metropolis-within-Gibbs algorithm. Given a starting point and the necessary MCMC parameters as set out below, performs a random-walk of the posterior space to produce an MCMC chain that can be used to generate MCMC density and iteration plots. The algorithm undergoes an adaptive period, where it changes the step size of the random walk for each parameter to approach the desired acceptance rate, target_acceptance_rate_theta. The algorithm then uses univ_proposal or mvr_proposal to explore parameter space, recording the value and posterior value at each step. The MCMC chain is saved in blocks as a .csv file at the location given by filename. This version of the algorithm is also designed to explore posterior densities for infection histories. See the package vignettes for examples.

Usage

serosolver(
  par_tab,
  antibody_data,
  antigenic_map = NULL,
  possible_exposure_times = NULL,
  mcmc_pars = c(),
  mvr_pars = NULL,
  n_chains = 1,
  parallel = FALSE,
  start_inf_hist = NULL,
  filename = "test",
  posterior_func = create_posterior_func,
  prior_func = NULL,
  prior_version = 2,
  measurement_indices = NULL,
  measurement_random_effects = FALSE,
  proposal_ratios = NULL,
  temp = 1,
  solve_likelihood = TRUE,
  n_alive = NULL,
  OPT_TUNING = 0.1,
  random_start_parameters = TRUE,
  start_level = "none",
  data_type = 1,
  verbose = TRUE,
  verbose_dev = FALSE,
  ...
)

Arguments

par_tab

The parameter table controlling information such as bounds, initial values etc. See example_par_tab

antibody_data

The data frame of titre data to be fitted. Must have columns: group (index of group); individual (integer ID of individual); samples (numeric time of sample taken); virus (numeric time of when the virus was circulating); titre (integer of titre value against the given virus at that sampling time); run (integer giving the repeated number of this titre); DOB (integer giving date of birth matching time units used in model). See example_antibody_data

antigenic_map

(optional) A data frame of antigenic x and y coordinates. Must have column names: x_coord; y_coord; inf_times. See example_antigenic_map

possible_exposure_times

(optional) this argument gives the vector of times at which individuals can be infected. Defaults to entries in antigenic_map.

mcmc_pars

Named vector named vector with parameters for the MCMC procedure. See details

mvr_pars

Leave NULL to use univariate proposals. Otherwise, a list of parameters if using a multivariate proposal. Must contain an initial covariance matrix, weighting for adapting cov matrix, and an initial scaling parameter (0-1)

start_inf_hist

Infection history matrix to start MCMC at. Can be left NULL. See example_inf_hist

filename

The full filepath at which the MCMC chain should be saved. "_chain.csv" will be appended to the end of this, so filename should have no file extensions

posterior_func

Pointer to posterior function used to calculate a likelihood. This will probably be create_posterior_func

prior_func

User function of prior for model parameters. Should take parameter values only

prior_version

which infection history assumption prior_version to use? See describe_priors for options. Can be 1, 2, 3 or 4

measurement_indices

optional NULL. For measurement bias function. Vector of indices of length equal to number of circulation times. For each year, gives the index of parameters named "rho" that correspond to each time period

measurement_random_effects

optional FALSE. Boolean indicating if measurement bias is a random effects term. If TRUE adds a component to the posterior calculation that calculates the probability of a set of measurement shifts "rho", given a mean and standard deviation

proposal_ratios

optional NULL. Can set the relative sampling weights of the infection state times. Should be an integer vector of length matching nrow(antigenic_map). Otherwise, leave as NULL for uniform sampling.

temp

Temperature term for parallel tempering, raises likelihood to this value. Just used for testing at this point

solve_likelihood

if FALSE, returns only the prior and does not solve the likelihood. Use this if you wish to sample directly from the prior

n_alive

if not NULL, uses this as the number alive for the infection history prior, rather than calculating the number alive based on antibody_data

OPT_TUNING

Constant describing the amount of leeway when adapting the proposals steps to reach a desired acceptance rate (ie. does not change step size if within OPT_TUNING of the specified acceptance rate)

random_start_parameters

if FALSE, uses whatever parameter values were passed in par_tab as the starting positions for the MCMC chain

verbose

if TRUE, prints progress updates during the run

verbose_dev

if TRUE, prints additional messages regarding step sizes, acceptance rates etc

...

Other arguments to pass to posterior_func

Details

The mcmc_pars argument has the following options:

  • iterations (number of post adaptive period iterations to run)

  • adaptive_iterations (for this many iterations, change proposal step size adaptively every adaptive_frequency iterations)

  • adaptive_frequency (adapt proposal step size every adaptive_frequency iterations)

  • thin (save every n iterations from theta samples)

  • thin_inf_hist (save every n iterations from infection history samples)

  • proposal_inf_hist_indiv_prop (proportion of individuals resampled at each infection history proposal)

  • save_block (after this many iterations (post thinning), save to disk)

  • target_acceptance_rate_theta (desired acceptance rate for theta parameters)

  • target_acceptance_rate_inf_hist (desired acceptance rate for infection histories)

  • proposal_ratio (ratio of infection history samples to theta samples ie. proposal_ratio = 2 means sample inf hist twice for every theta sample, proposal_ratio = 0.5 means sample theta twice for every inf hist sample)

  • proposal_inf_hist_time_prop (proportion of infection times to resample for each individual at each iteration)

  • proposal_inf_hist_distance (number of infection years/months to move when performing infection history swap step)

  • proposal_inf_hist_adaptive (if 1, performs adaptive infection history proposals. If 0, retains the starting infection history proposal parameters. This should ONLY be turned on for prior version 2)

  • proposal_inf_hist_indiv_swap_ratio (if using gibbs sampling of infection histories, what proportion of proposals should be swap steps, between 0 and 1)

  • proposal_inf_hist_group_swap_ratio (proportion of infection history proposal steps to swap proposal_inf_hist_group_swap_prop of two time periods' contents, between 0 and 1)

  • proposal_inf_hist_group_swap_prop (when swapping contents of two time points, what proportion of individuals should have their contents swapped, between 0 and 1)

  • propose_from_prior (set to 1 to sample directly from the infection history prior, or 0 for independent proposals. Sometimes one version works better than the other, so try switching if you are getting poor infection history convergence)

Value

A list with: 1) relative file path at which the MCMC chain is saved as a .csv file; 2) relative file path at which the infection history chain is saved as a .csv file; 3) the last used covariance matrix if mvr_pars != NULL; 4) the last used scale/step size (if multivariate proposals) or vector of step sizes (if univariate proposals)

See Also

https://github.com/jameshay218/lazymcmc

Other mcmc: generate_start_tab(), rm_scale(), save_infection_history_to_disk(), scaletuning()

Examples

## Not run: 
data(example_antibody_data)
data(example_par_tab)
data(example_antigenic_map)
res <- serosolver(example_par_tab[example_par_tab$names != "phi",], example_antibody_data, example_antigenic_map, prior_version=2)

## End(Not run)

adamkucharski/serosolver documentation built on April 28, 2024, 6:19 p.m.