In JackAHutchings/jahrfun: JAH Miscellaneous R Functions

bal.boot

R Documentation

Implementation of the balanced bootstrap for one or two samples

Description

This function performs a balanced bootstrap on one or two samples. If only one sample is provided, its bootstrapped distribution is compared against a defined value (null) to test for significance. If two samples are provided, the first sample's summary statistic is subtracted from the second sample's summary statistic, and the difference is compared against the defined value (null) to test for significance.

Usage

bal.boot(
  a,
  b = NA,
  asd = NA,
  bsd = NA,
  n = 10000,
  ci.width = 95,
  null = 0,
  stat.function = mean,
  paired = F
)

Arguments

`a`	First dataset, a numerical vector.
`b`	Second dataset, a numerical vector. If paired=TRUE, its length must equal the length of a.
`asd`	(optional) First dataset's error as 1 standard deviation. If used, this must either be a single number (if error is uniform) or a vector with length equal to a.
`bsd`	(optional) Second dataset's error as 1 standard deviation. If used, this must either be a single number (if error is uniform) or a vector with length equal to b.
`n`	Number of bootstrap replicates to perform.
`ci.width`	Width of the confidence interval to use for hypothesis testing, a single numeric value between 1 and 100.
`null`	Value representing the null hypothesis, default is 0.
`stat.function`	Sampling statistic to use. Can use any function that takes a single vector and reports a single numerical value as its result. Default is the mean.
`paired`	(optional) Boolean to indicate if the data are paired. Only relevant for two-sample situations.

Details

By default, this is a simple balanced bootstrap in which the sample is replicated n times, shuffled, and then split into groups equal in size to the original sample. However, if errors (asd or bsd) are provided for either group, then each bootstrap replicate of a value with an associated error is drawn from a normal distribution whose mean is the observed value and whose standard deviation is the user-submitted error. Thus, if errors are provided, this is more of a balanced Monte Carlo and carries the additional assumption that the true distribution surrounding individual values is normal.
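The replicate, shuffle, and split procedure described above can be sketched in base R. This is an illustrative sketch only, not the package's own code: the function name balanced_boot_sketch and its arguments (x, n, stat, xsd) are hypothetical, and the way errors are applied follows the description above rather than the actual implementation.

```r
# Minimal sketch of a balanced bootstrap (hypothetical helper, not bal.boot itself).
# x: numeric vector; n: number of replicates; stat: summary function.
# xsd: optional error (1 SD), a single value or one value per element of x.
balanced_boot_sketch <- function(x, n = 1000, stat = mean, xsd = NULL) {
  k <- length(x)
  # Replicate the indices n times and shuffle, so that every observation
  # appears exactly n times across all replicates combined.
  idx <- sample(rep(seq_len(k), times = n))
  vals <- x[idx]
  if (!is.null(xsd)) {
    # Monte Carlo variant: redraw each resampled value from a normal
    # distribution centred on the observation, with the supplied SD.
    sds <- rep_len(xsd, k)[idx]
    vals <- rnorm(length(vals), mean = vals, sd = sds)
  }
  # Split into n groups of the original sample size (one column per replicate).
  reps <- matrix(vals, nrow = k, ncol = n)
  list(index = matrix(idx, nrow = k, ncol = n),
       stats = apply(reps, 2, stat))
}

set.seed(1)
x <- c(2.1, 3.4, 1.9, 5.0, 4.2)
out <- balanced_boot_sketch(x, n = 200)
table(out$index)   # every observation is used exactly 200 times
```

Because each observation is used equally often, the average of the replicate means matches the sample mean up to floating-point error when no errors are supplied, which is the property that distinguishes a balanced bootstrap from ordinary resampling with replacement.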

The 'balanced' notion used here is taken from Davison et al. (1986), "Efficient bootstrap simulation", Biometrika, Volume 73, Issue 3, Pages 555–566, https://doi.org/10.1093/biomet/73.3.555

The 'difference' tested is b minus a, so a positive difference indicates that the statistic for b is greater than the statistic for a, and a negative difference indicates that it is smaller.
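The direction of the difference and its comparison against the null can be illustrated with a short base R sketch for the unpaired case. It assumes a percentile interval of width ci.width, which this page does not confirm as the method bal.boot uses; the helper boot_stat and the example data are hypothetical.

```r
# Sketch: test the difference in a statistic (b minus a) against a null value.
# Assumption: a percentile interval is used; bal.boot may do this differently.
set.seed(42)
a <- c(4.8, 5.1, 5.6, 4.9, 5.3, 5.0)
b <- c(6.2, 5.9, 6.8, 6.1, 6.5, 6.0, 6.4)
n <- 2000
ci.width <- 95
null <- 0

# Balanced resampling of one sample, returning one statistic per replicate.
boot_stat <- function(x, n, stat = mean) {
  vals <- sample(rep(x, times = n))
  apply(matrix(vals, nrow = length(x)), 2, stat)
}

# Difference per replicate, taken as b minus a.
diffs <- boot_stat(b, n) - boot_stat(a, n)

# Percentile interval and comparison against the null value.
alpha <- 1 - ci.width / 100
ci <- quantile(diffs, probs = c(alpha / 2, 1 - alpha / 2))
significant <- null < ci[[1]] || null > ci[[2]]
```

In this example every value of b exceeds every value of a, so all replicate differences are positive and the interval lies entirely above the null of 0.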

Output

This function returns a named list as the result:

tidy.data - Summary statistics of the bootstrapped distribution
data - The complete bootstrap distribution(s) summarized at the replicate level using stat.function
parameters - The input parameters of the function
stat.function - The actual function used as the sampling statistic
input - The input data
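
A call might look like the following sketch. It assumes the jahrfun package is installed and that bal.boot is exported from it; the data and the trimmed_mean helper are hypothetical. The argument names come from the Usage section and the element names from the list above.

```r
a <- c(4.8, 5.1, 5.6, 4.9, 5.3, 5.0)
b <- c(6.2, 5.9, 6.8, 6.1, 6.5, 6.0, 6.4)

# Hypothetical custom statistic: takes one vector, returns one number,
# as stat.function requires.
trimmed_mean <- function(x) mean(x, trim = 0.1)

# Only run the call if the package is available (assumed name: jahrfun).
if (requireNamespace("jahrfun", quietly = TRUE)) {
  res <- jahrfun::bal.boot(a, b, n = 2000, stat.function = trimmed_mean)
  names(res)      # documented elements: tidy.data, data, parameters, stat.function, input
  res$tidy.data   # summary statistics of the bootstrapped distribution
}
```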