apollo_bootstrap: Bootstrap a model
In apollo: Tools for Choice Model Estimation and Application

Description

Samples individuals with replacement from the database, and estimates the model for each sample.

Usage

apollo_bootstrap(
  apollo_beta,
  apollo_fixed,
  apollo_probabilities,
  apollo_inputs,
  estimate_settings = list(
    estimationRoutine = "bgw", maxIterations = 200, writeIter = FALSE,
    hessianRoutine = "none", printLevel = 2L, silent = FALSE,
    maxLik_settings = list()
  ),
  bootstrap_settings = list(
    nRep = 30, samples = NA, calledByEstimate = FALSE, recycle = TRUE
  )
)

Arguments

`apollo_beta`	Named numeric vector. Names and values for parameters.
`apollo_fixed`	Character vector. Names (as defined in `apollo_beta`) of parameters whose value should not change during estimation.
`apollo_probabilities`	Function. Returns probabilities of the model to be estimated. Must receive three arguments:
`apollo_beta`: Named numeric vector. Names and values of model parameters.
`apollo_inputs`: List containing options of the model. See apollo_validateInputs.
`functionality`: Character. Can be either `"components"`, `"conditionals"`, `"estimate"` (default), `"gradient"`, `"output"`, `"prediction"`, `"preprocess"`, `"raw"`, `"report"`, `"shares_LL"`, `"validate"` or `"zero_LL"`.
`apollo_inputs`	List grouping most common inputs. Created by function apollo_validateInputs.
`estimate_settings`	List. Options controlling the estimation process. See apollo_estimate. `hessianRoutine="none"` by default.
`bootstrap_settings`	List containing settings for the sampling procedure. User input is required for all settings except those with a default or marked as optional.
calledByEstimate: Logical. TRUE if `apollo_bootstrap` is called by apollo_estimate. FALSE by default.
nRep: Numeric scalar. Number of times the model must be estimated with different samples. Default is 30.
recycle: Logical. If TRUE, the function will look for old output files and append new repetitions to them. If FALSE, output files will be overwritten.
samples: Numeric matrix or data.frame. Optional argument. Must have as many rows as observations in the `database`, and as many columns as number of repetitions wanted. Each column represents a re-sample, and each element the number of times that observation must be included in the sample. If this argument is provided, then `nRep` is ignored. Note that this allows sampling at the observation rather than the individual level, which is not recommended for panel data.
seed: DEPRECATED, `apollo_control$seed` is used since v0.2.5. Numeric scalar (integer). Random number generator seed to generate the bootstrap samples. Only used if `samples` is `NA`. Default is 24.

Details

This function implements a basic block bootstrap. It estimates the model parameters on nRep different samples. Each new sample is constructed by sampling with replacement from the original full sample. Each new sample has as many individuals as the original sample, though some of them may be repeated. Sampling is done at the individual level, so if individuals have different numbers of observations, the re-samples will not necessarily all contain the same number of observations.
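
The individual-level resampling described above can be illustrated with a small base R sketch. This is not apollo's internal code: the toy `database`, the `ID` column and all variable names are assumptions made for illustration only.

```r
# Toy panel: 4 individuals with unequal numbers of observations (hypothetical data).
set.seed(1)
database <- data.frame(ID = rep(1:4, times = c(2, 3, 1, 4)))

# Draw as many individuals as in the original sample, with replacement.
ids   <- unique(database$ID)
drawn <- sample(ids, size = length(ids), replace = TRUE)

# Count how often each individual was drawn (0 if never drawn).
timesDrawn <- tabulate(match(drawn, ids), nbins = length(ids))

# Every observation of an individual inherits that individual's count,
# so whole blocks of observations enter or leave the re-sample together.
obsCount <- timesDrawn[match(database$ID, ids)]

nIndivResample <- sum(timesDrawn)  # always equals length(ids)
nObsResample   <- sum(obsCount)    # can differ from nrow(database)
```

The number of individuals in the re-sample is fixed, while the number of observations depends on which individuals happen to be drawn.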

If the sampling should be done at the observation level (not recommended for panel data), then the optional bootstrap_settings$samples argument should be provided.
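
According to the argument description, `samples` has one row per observation and one column per repetition, with each element giving the number of times that observation is included. A minimal sketch of how such a matrix might be built with base R follows; `nObs` and `nRep` are hypothetical values, and the use of `rmultinom` is one possible approach, not necessarily what apollo does internally.

```r
set.seed(2)
nObs <- 10   # hypothetical number of rows in the database
nRep <- 5    # hypothetical number of repetitions

# Each column is one re-sample: nObs draws with replacement spread over
# nObs observations, so element [i, r] is the number of times
# observation i enters re-sample r.
samples <- rmultinom(n = nRep, size = nObs, prob = rep(1 / nObs, nObs))

dim(samples)      # nObs rows, nRep columns
colSums(samples)  # every re-sample contains nObs observations
```

A matrix of this shape could then be supplied as `bootstrap_settings = list(samples = samples)`, in which case `nRep` is ignored.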

For each sample, only the parameters and log-likelihood are estimated. Standard errors are not calculated (they may be added in future versions). The composition of the re-samples is stored in a file, and is reproducible when the same seed is used.

This function writes three different files to the working or output directory:

modelName_bootstrap_params.csv: estimated parameters, final log-likelihood, and number of observations for each re-sample
modelName_bootstrap_samples.csv: composition of each re-sample.
modelName_bootstrap_vcov.csv: variance-covariance matrix of the estimated parameters across re-samples.

The first two files are updated throughout the run of this function, while the last one is only written once the function finishes.

When run, this function will look for the first two files above in the working/output directory. If they are found, the function will attempt to pick up re-sampling from where those files left off. This is useful in cases where the original bootstrapping was interrupted, or when additional re-sampling runs are to be performed.

Value

List with three elements.

estimates: Matrix containing the parameter estimates for each repetition. As many rows as repetitions and as many columns as parameters.
LL: Vector of final log-likelihoods of each repetition.
varcov: Covariance matrix of the estimated parameters across the repetitions.
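
As a sketch of how the returned list might be used, the example below derives bootstrap standard errors and percentile intervals. The object `bootOut` is a simulated stand-in with hypothetical parameter names, not real output, and it assumes that `varcov` is the sample covariance of the rows of `estimates`.

```r
set.seed(3)
# Stand-in for the returned list: 30 repetitions of two hypothetical parameters.
estimates <- cbind(b_cost = rnorm(30, mean = -0.5, sd = 0.1),
                   b_time = rnorm(30, mean = -0.2, sd = 0.05))
bootOut <- list(estimates = estimates,
                LL        = rnorm(30, mean = -1000, sd = 10),
                varcov    = cov(estimates))  # assumption: sample covariance

# Bootstrap standard errors: square roots of the diagonal of varcov.
bootSE <- sqrt(diag(bootOut$varcov))

# Percentile intervals taken directly from the repetitions
# (one column per parameter, rows are the 2.5% and 97.5% quantiles).
bootCI <- apply(bootOut$estimates, 2, quantile, probs = c(0.025, 0.975))
```

With only 30 repetitions (the default `nRep`), percentile intervals are rough; a larger `nRep` would normally be preferred for interval estimates.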

This function also writes three output files to the working/output directory, with the following names ('x' represents the model name):

x_bootstrap_params.csv: Table containing the parameter estimates, log-likelihood, and number of observations for each repetition.
x_bootstrap_samples.csv: Table containing the description of the sample used in each repetition. Same format as bootstrap_settings$samples.
x_bootstrap_vcov.csv: Table containing the covariance matrix of estimated parameters across the repetitions.