bootstrap_model: Computes bootstrap resamples of your data, stores estimates +...

View source: R/bootstrap_model.R

Description

By default, this will compute bootstrap resamples and then send them to bootstrap_ci for calculation.

Usage

bootstrap_model(
  base_model,
  base_data,
  resamples = 9999,
  return_coefs_instead = FALSE,
  parallelism = c("none", "future", "parallel"),
  resample_specific_blocks = NULL,
  unique_resample_lim = NULL,
  narrowness_avoid = TRUE,
  num_cores = NULL,
  future_packages = NULL,
  suppress_sampling_message = !interactive()
)

Arguments

base_model

The pre-bootstrap model, i.e. the model output from running a standard model call. Examples:

  base_model <- glmmTMB(y ~ age + (1 | subj), data = rel_data, family = binomial)
  base_model <- lm(y ~ x, data = xy_frame)

base_data

The data that was used in the model call. You can leave this to be read automatically from the model, but I highly recommend supplying it.

resamples

How many resamples of your data do you want to do? 9,999 is a reasonable default (see Hesterberg 2015), but start very small to make sure it works on your data properly, and to get a rough timing estimate etc.

return_coefs_instead

Logical, default FALSE: do you want the list of lists of results for each bootstrap sample (set to TRUE), or the matrix output of all samples? See the Value section for more details.

parallelism

Type of parallelism (if any) to use to run the resamples. Options are:

"none"

The default, sequential

"future"

To use future.apply (futures)

"parallel"

To use parallel::mclapply
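
To make the difference between the sequential and "parallel" options concrete, here is a minimal base-R sketch of refitting a model on bootstrap resamples first with lapply and then with parallel::mclapply. It only illustrates the idea and is not the package's internal code; the data frame xy, the helper one_fit and the choice of two cores are assumptions made for this example.

```r
set.seed(1)
xy <- data.frame(x = rnorm(30))
xy$y <- xy$x + rnorm(30)

# One bootstrap replicate (hypothetical helper): resample rows, refit, return coefficients
one_fit <- function(i, data) {
  idx <- sample.int(nrow(data), replace = TRUE)
  coef(lm(y ~ x, data = data[idx, , drop = FALSE]))
}

# Sequential evaluation, analogous to parallelism = "none"
seq_fits <- lapply(1:10, one_fit, data = xy)

# Forked workers via parallel::mclapply, analogous to parallelism = "parallel".
# Forking is not available on Windows, so fall back to a single core there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L
par_fits <- parallel::mclapply(1:10, one_fit, data = xy, mc.cores = cores)
```

Both calls return a list with one coefficient vector per replicate; only the way the work is scheduled differs.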

resample_specific_blocks

Character vector, default NULL. If left NULL, the algorithm will choose ONE random block to resample over: the one with the largest entropy (often the one with the most levels). If you wish to resample over specific random effects as blocks, enter their names here; this can be one name or many. Note that resampling over multiple blocks is in general quite conservative. If you want to perform case resampling even though you DO have random effects, set resample_specific_blocks to any non-NULL value that isn't the name of a random effect variable.
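
The sketch below shows one way to pick the block with the largest entropy and then resample whole blocks. It is an illustration under stated assumptions, not the package's implementation: the data frame dat, the helper level_entropy, and the use of Shannon entropy over level frequencies are all choices made for this example.

```r
set.seed(1)
# Hypothetical data with two grouping variables
dat <- data.frame(
  subj = factor(sample(paste0("s", 1:10), 100, replace = TRUE)),
  site = factor(sample(c("a", "b"), 100, replace = TRUE))
)

# Shannon entropy of the level frequencies (an assumed definition of "entropy")
level_entropy <- function(f) {
  p <- table(f) / length(f)
  p <- p[p > 0]
  -sum(p * log(p))
}

# Choose the grouping variable with the largest entropy
entropies <- vapply(dat[c("subj", "site")], level_entropy, numeric(1))
block <- names(which.max(entropies))

# Block resampling: draw levels with replacement, then keep every row of each drawn level
lvls <- levels(dat[[block]])
drawn <- sample(lvls, length(lvls), replace = TRUE)
boot_dat <- do.call(rbind, lapply(drawn, function(l) dat[dat[[block]] == l, ]))
```

Here subj has many more levels than site, so it has the larger entropy and is the block that gets resampled.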

unique_resample_lim

Should be the same length as the number of random effects (or left NULL), with names matching the random effect columns. Use this to force each resample to contain at least a minimum number of unique values. Don't make this too big.
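
As a rough sketch of what a minimum-unique-values constraint can look like, the snippet below redraws a set of levels until enough distinct ones appear. The redraw loop is an assumption made for illustration and may not match how the package enforces the limit; the level names and the limit of 5 are hypothetical.

```r
set.seed(1)
lvls <- paste0("s", 1:12)            # hypothetical levels of a random effect "subj"
unique_resample_lim <- c(subj = 5)   # named after the random effect column

# Redraw until the resample contains at least the requested number of unique levels
repeat {
  drawn <- sample(lvls, length(lvls) - 1, replace = TRUE)
  if (length(unique(drawn)) >= unique_resample_lim[["subj"]]) break
}
```

Under a scheme like this, a limit close to the total number of levels would be met only rarely, which is one reason to keep the value small.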

narrowness_avoid

Boolean, default TRUE. If TRUE, will resample n-1 instead of n elements in the bootstrap (n being either rows, or random effect levels, depending on existence of random effects). If FALSE, will do typical size n resampling.
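
The two resample sizes can be compared with a short base-R sketch. This shows the idea only, not the package's internal code; n, idx_narrow and idx_full are names made up for the example.

```r
set.seed(1)
n <- 20  # number of rows, or of random effect levels

# narrowness_avoid = TRUE: draw n - 1 elements with replacement
idx_narrow <- sample.int(n, size = n - 1, replace = TRUE)

# narrowness_avoid = FALSE: the usual size-n bootstrap draw
idx_full <- sample.int(n, size = n, replace = TRUE)
```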

num_cores

How many cores to use. Defaults to parallel::detectCores() - 1L if parallelism = "parallel"

future_packages

Packages to pass to created futures when using parallelism = "future". This must be supplied if the package used to model the data isn't in base and you're using a plan that doesn't have shared memory, because the model is updated with the S3 generic update.

suppress_sampling_message

Logical; by default the message is suppressed when not in an interactive session. Controls whether the function messages the console with the type of bootstrapping: when block resampling over random effects, it says which effect it is sampling over; when case resampling, it says so. Set to TRUE to hide the message.

Value

By default (with return_coefs_instead being FALSE), returns the output from bootstrap_ci: for each set of covariates (usually just the one set, the conditional model) we get a matrix of output, with a row for each variable (including the intercept) and entries for the estimate, the boot and base CIs, and p-values. If return_coefs_instead is TRUE, it will instead return a list of length two:

This output is useful for error checking, and if you want to run this function in certain distributed ways.

Examples

library(glmmboot)

# Simulate a small data set and fit the base model
x <- rnorm(20)
y <- rnorm(20) + x
xy_data <- data.frame(x = x, y = y)
first_model <- lm(y ~ x, data = xy_data)

# Default output: the bootstrap_ci results
out_matrix <- bootstrap_model(first_model, base_data = xy_data, resamples = 20)

# Per-resample results instead of the summary output
out_list <- bootstrap_model(first_model,
                            base_data = xy_data,
                            resamples = 20,
                            return_coefs_instead = TRUE)

# Mixed model with a random intercept for subj
data(test_data)
library(glmmTMB)
test_formula <- as.formula('y ~ x_var1 + x_var2 + x_var3 + (1|subj)')
test_model <- glmmTMB(test_formula, data = test_data, family = binomial)
output_matrix <- bootstrap_model(test_model, base_data = test_data, resamples = 199)

output_lists <- bootstrap_model(test_model,
                                base_data = test_data,
                                resamples = 199,
                                return_coefs_instead = TRUE)

glmmboot documentation built on June 28, 2021, 1:05 a.m.