bal_binary_GBF_BLR: (Balanced Binary) D&C Generalised Bayesian Fusion using SMC
In rchan26/hierarchicalFusion: Divide-and-Conquer Monte Carlo Fusion

bal_binary_GBF_BLR

R Documentation

(Balanced Binary) D&C Generalised Bayesian Fusion using SMC

Description

(Balanced Binary) D&C Generalised Bayesian Fusion using SMC for Bayesian Logistic Regression

Usage

bal_binary_GBF_BLR(
  N_schedule,
  m_schedule,
  time_mesh = NULL,
  base_samples,
  L,
  dim,
  data_split,
  prior_means,
  prior_variances,
  C,
  precondition = TRUE,
  resampling_method = "multi",
  ESS_threshold = 0.5,
  cv_location = "hypercube_centre",
  adaptive_mesh = FALSE,
  mesh_parameters = NULL,
  record = FALSE,
  diffusion_estimator = "Poisson",
  beta_NB = 10,
  gamma_NB_n_points = 2,
  local_bounds = TRUE,
  seed = NULL,
  n_cores = parallel::detectCores(),
  print_progress_iters = 1000
)
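
The schedule arguments in the signature above have to agree with the shape of the tree. The following is a minimal sketch of how they might be set up, not code taken from the package. It assumes that a balanced binary tree over C base sub-posteriors has log2(C) + 1 levels and that two nodes are fused at every level; `n_particles` is a hypothetical name and value chosen for illustration.

```r
# Sketch only: the relationships between C, L and the schedules are assumptions.
C <- 16                        # number of sub-posteriors at the base level
L <- log2(C) + 1               # assumed number of levels for a balanced binary tree
stopifnot(L == round(L))       # C must be a power of 2 under this assumption

n_particles <- 10000           # hypothetical number of samples per node
N_schedule <- rep(n_particles, L - 1)  # one entry per level above the base
m_schedule <- rep(2, L - 1)            # assumed: fuse two nodes at each level
```

Both schedules have length L-1, which matches the lengths stated for `N_schedule` and `m_schedule`.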

Arguments

`N_schedule`	vector of length (L-1), where N_schedule[l] is the number of samples per node at level l
`m_schedule`	vector of length (L-1), where m_schedule[l] is the number of samples to fuse for level l
`time_mesh`	time mesh used in Bayesian Fusion. This can either be a vector which will be used for each node in the tree, or it can be passed in as NULL, where a recommended mesh will be generated using the parameters passed into mesh_parameters
`base_samples`	list of length C, where base_samples[[c]] contains the samples for the c-th node in the level
`L`	total number of levels in the hierarchy
`dim`	dimension of the predictors (= p+1)
`data_split`	list of length m, with one item per sub-posterior c=1,...,m. Each item is a list of length 4: data_split[[c]]$y is the vector of y responses, data_split[[c]]$X is the design matrix of the covariates for sub-posterior c, data_split[[c]]$full_data_count holds the unique rows of the full data set with their counts, and data_split[[c]]$design_count holds the unique rows of the design matrix with their counts
`prior_means`	prior for means of predictors
`prior_variances`	prior for variances of predictors
`C`	number of sub-posteriors at the base level
`precondition`	either a logical value determining whether preconditioning matrices are used (TRUE: each matrix is set to the variance of the corresponding sub-posterior samples; FALSE: the identity matrix is used for all sub-posteriors), or a list of length C where precondition[[c]] is the preconditioning matrix for sub-posterior c. Default is TRUE
`resampling_method`	method to be used in resampling, default is multinomial resampling ('multi'). Other choices are stratified ('strat'), systematic ('system'), residual ('resid')
`ESS_threshold`	number between 0 and 1 giving the fraction of the number of samples N below which the ESS must fall to trigger resampling (i.e. resampling is carried out only when ESS < N*ESS_threshold)
`cv_location`	string determining the location of the control variate. Must be either 'mode', where the MLE will be used, or 'hypercube_centre' (default), to use the centre of the simulated hypercube
`adaptive_mesh`	logical value to indicate if an adaptive mesh is used (default is FALSE)
`mesh_parameters`	list of parameters used for mesh
`record`	logical value indicating if variables such as E[nu_j], chosen, mesh_terms and k4_choice should be recorded at each iteration and returned (see return variables for this function) - default is FALSE
`diffusion_estimator`	choice of unbiased estimator for the Exact Algorithm between "Poisson" (default) for Poisson estimator and "NB" for Negative Binomial estimator
`beta_NB`	beta parameter for Negative Binomial estimator (default 10)
`gamma_NB_n_points`	number of points used in the trapezoidal estimation of the integral found in the mean of the negative binomial estimator (default is 2)
`local_bounds`	logical value indicating if local bounds for the phi function are used (default is TRUE)
`seed`	seed number - default is NULL, meaning there is no seed
`n_cores`	number of cores to use (default is parallel::detectCores())
`print_progress_iters`	number of iterations between each progress update (default is 1000). If NULL, progress will only be updated when importance sampling is finished
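
The `data_split` argument described above has a nested layout that can be hard to picture. The following is a minimal base R sketch of one way to build a list with the four named components. The helper `count_unique_rows`, the simulated data and the `count` column name are all hypothetical, and the exact column layout the package expects is not stated here, so treat this as an illustration of the structure only.

```r
# Sketch only: helper name, data and column names are assumptions.
count_unique_rows <- function(df) {
  # Collapse duplicated rows and record how often each one occurs
  df$count <- 1L
  aggregate(count ~ ., data = df, FUN = sum)
}

set.seed(1)
n <- 200   # total number of observations (illustrative)
C <- 4     # number of sub-posteriors at the base level
X <- cbind(intercept = 1,
           x1 = rbinom(n, 1, 0.5),
           x2 = rbinom(n, 1, 0.3))   # design matrix with p = 2 covariates
y <- rbinom(n, 1, 0.4)               # binary responses

# Assign observations to the C subsets in turn
idx <- split(seq_len(n), rep(seq_len(C), length.out = n))

data_split <- lapply(idx, function(i) {
  X_c <- X[i, , drop = FALSE]
  list(y = y[i],
       X = X_c,
       full_data_count = count_unique_rows(data.frame(y = y[i], X_c)),
       design_count = count_unique_rows(as.data.frame(X_c)))
})
```

In this sketch `ncol(X)` equals p+1, which corresponds to the value passed as `dim`.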

Value

A list with components:

particles: list of length (L-1), where particles[[l]][[i]] are the particles for level l, node i
proposed_samples: list of length (L-1), where proposed_samples[[l]][[i]] are the proposed samples for level l, node i
time: list of length (L-1), where time[[l]][[i]] is the run time for level l, node i
elapsed_time: list of length (L-1), where elapsed_time[[l]][[i]] is the elapsed time of each step of the algorithm for level l, node i
time_mesh: list of length (L-1), where time_mesh[[l]][[i]] is the time_mesh used for level l, node i
ESS: list of length (L-1), where ESS[[l]][[i]] is the effective sample size of the particles after each step BEFORE deciding whether or not to resample for level l, node i
CESS: list of length (L-1), where CESS[[l]][[i]] is the conditional effective sample size of the particles after each step for level l, node i
resampled: list of length (L-1), where resampled[[l]][[i]] is a boolean value recording whether the particles were resampled after each step (rho and Q) for level l, node i
precondition_matrices: pre-conditioning matrices that were used
sub_posterior_means: sub-posterior means that were used
recommended_mesh: list of length (L-1), where recommended_mesh[[l]][[i]] is the recommended mesh for level l, node i
data_inputs: list of length (L-1), where data_inputs[[l]][[i]] is the data input for the sub-posterior in level l, node i
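
The ESS and resampled components follow the rule stated for `ESS_threshold`: resampling happens only when ESS < N*ESS_threshold. The following is a minimal sketch of that check for a single step. It assumes the usual definition of ESS as the inverse of the sum of squared normalised weights and uses multinomial resampling; the weights are made-up values, and the package's internal computation may differ.

```r
# Sketch only: illustrative weights, standard ESS definition assumed.
set.seed(1)
log_w <- c(-1.2, -0.3, -2.5, -0.9, -4.0)  # unnormalised log-weights (made up)
w <- exp(log_w - max(log_w))              # subtract the max for numerical stability
w <- w / sum(w)                           # normalise so the weights sum to 1

N <- length(w)
ESS <- 1 / sum(w^2)                       # effective sample size, between 1 and N
ESS_threshold <- 0.5

resampled <- ESS < N * ESS_threshold      # the rule stated for ESS_threshold
if (resampled) {
  # Multinomial resampling: draw N indices with probability proportional to w
  idx <- sample.int(N, size = N, replace = TRUE, prob = w)
  w <- rep(1 / N, N)                      # weights are reset to uniform afterwards
}
```

The ESS is computed before the resampling decision, which matches the description of the ESS component above.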

If record is set to TRUE, additional components will be returned:

E_nu_j: list of length (L-1), where E_nu_j[[l]][[i]] is the approximation of the average variation of the trajectories at each time step for level l, node i
chosen: list of length (L-1), where chosen[[l]][[i]] indicates which term was chosen if using an adaptive mesh at each time step for level l, node i
mesh_terms: list of length (L-1), where mesh_terms[[l]][[i]] indicates the evaluated terms in deciding the mesh at each time step for level l, node i
k4_choice: list of length (L-1), where k4_choice[[l]][[i]] indicates which of the roots of k4 were chosen at each time step for level l, node i