blb | R Documentation |
Bag of little bootstraps (BLB), as described by Kleiner et al. (2014), implemented with adaptive convergence checking.
blb(
  data,
  subset_size_b = nrow(data)^0.7,
  n_subsets = NA,
  n_resamples = 100,
  window_subsets = 3,
  window_resamples = 20,
  epsilon = 0.05,
  fun_estimator = NULL,
  fun_metric = NULL
)
data |
A two-dimensional numerical data object. |
subset_size_b |
An integer value. The number of rows for each subset
bootstraps. Kleiner et al. 2014 suggest empirically a value of
|
n_subsets |
An integer value. The upper limit of sampled subsets
s. If |
n_resamples |
An integer value. The upper limit of Monte-Carlo iterations (resamples, r) carried out on each subset. Kleiner et al. 2014 found empirically that a value of r = 100 worked well for confidence intervals. If convergence is achieved earlier, then not all r resamples are processed. |
window_subsets |
An integer value. The number of most recent subsets considered when checking adaptively for convergence. |
window_resamples |
An integer value. The number of most recent resamples considered when checking adaptively for convergence. |
epsilon |
A positive numerical value. The acceptable relative error to determine convergence. |
fun_estimator |
A function with two arguments, x and weights, where x will be the j-th subset of |
fun_metric |
A function with one argument, x, which will be applied to each element of the results of |
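The arguments window_subsets, window_resamples, and epsilon describe a windowed relative-error check. The following is a minimal sketch of one such check, modelled on the convergence assessment in Kleiner et al. 2014; the helper name has_converged and its arguments are hypothetical, and it is an assumption that blb applies this exact rule internally.

```r
# Hypothetical helper, not part of the documented blb() interface.
# history: numeric matrix with one row per iteration (subset or resample)
#          and one column per tracked estimate.
# window:  number of previous iterations to compare against the latest one.
# epsilon: acceptable mean relative error.
has_converged <- function(history, window, epsilon) {
  history <- as.matrix(history)
  t <- nrow(history)
  # Not enough iterations yet to fill the window
  if (t <= window) return(FALSE)
  current <- history[t, ]
  previous <- history[(t - window):(t - 1), , drop = FALSE]
  # Mean relative deviation of each previous row from the latest row.
  # Note: a latest value of exactly 0 makes the relative error undefined.
  rel_err <- apply(
    previous, 1,
    function(z) mean(abs(z - current) / abs(current))
  )
  all(rel_err <= epsilon)
}

# A series that has settled within 5 % of its latest value
has_converged(matrix(c(1, 1.01, 1.02, 1), ncol = 1), window = 3, epsilon = 0.05)
```

Under this sketch, a larger window or a smaller epsilon makes the check stricter, so more subsets or resamples are processed before stopping.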
A two-dimensional object of means across all subsets, where the rows represent the estimator(s) of quality assessment, i.e., the output of fun_metric, and the columns represent the estimator(s) of interest, i.e., the output of fun_estimator.
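To illustrate the layout of the returned object, the sketch below builds a result of the same shape by hand. The resampled estimates are simulated with rnorm, and the names make_estimates, per_subset, and result are hypothetical; how blb stores intermediate results internally is not documented here and is assumed only for illustration.

```r
set.seed(1)

# Quality assessment: three quantiles of the resampled estimates
fun_metric <- function(x) quantile(x, probs = c(0.025, 0.5, 0.975))

# Hypothetical stand-in for the resampled estimates of one subset:
# 100 resamples (rows) of two estimators "a" and "b" (columns)
make_estimates <- function() cbind(a = rnorm(100, 1), b = rnorm(100, 5))

# Apply the metric to each estimator column, once per subset
per_subset <- lapply(1:2, function(j) apply(make_estimates(), 2, fun_metric))

# Average the metric matrices across subsets
result <- Reduce(`+`, per_subset) / length(per_subset)

dim(result)       # rows: metrics, columns: estimators
rownames(result)
colnames(result)
```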
Kleiner, A., A. Talwalkar, P. Sarkar, and M. I. Jordan. 2014. A scalable bootstrap for massive data. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 76:795–816.
Function drBLB of package datadr (https://github.com/delta-rho/datadr).
n <- 10000
xt <- seq(0, 10, length.out = n)
ex_data <- data.frame(
  x1 = sample(xt),
  x2 = sample(xt)
)

# Linear regression with coefficients 1, 2, and 3
ex_data[, "y"] <-
  1 + rnorm(n, 0, 1) + 2 * ex_data[, "x1"] + 3 * ex_data[, "x2"]

# Estimate coefficients with BLB
blb(
  data = ex_data,
  fun_estimator = function(x, weights) {
    coef(lm(
      y ~ x1 + x2,
      data = x,
      weights = weights / max(weights)
    ))
  },
  fun_metric = function(x) {
    quantile(x, probs = c(0.025, 0.5, 0.975))
  }
)
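For readers who want to see where per-row weights come from, the following base R sketch runs the BLB scheme of Kleiner et al. 2014 for a simple mean. It does not call blb: the numbers of subsets and resamples are fixed (no adaptive stopping), and all object names are hypothetical. That blb passes multinomial counts of this kind as the weights argument of fun_estimator is an assumption.

```r
set.seed(42)
n <- 10000
y <- rnorm(n, mean = 3, sd = 2)

b <- floor(n^0.7)  # rows per subset
s <- 5             # number of subsets (fixed here)
r <- 50            # resamples per subset (fixed here)

ci_per_subset <- sapply(seq_len(s), function(j) {
  # Draw a subset of b rows without replacement
  idx <- sample.int(n, b)
  est <- replicate(r, {
    # A resample of size n from the subset, stored as counts per row
    w <- as.vector(rmultinom(1, size = n, prob = rep(1 / b, b)))
    weighted.mean(y[idx], w)
  })
  # Quality assessment for this subset: 95 % interval of the estimates
  quantile(est, probs = c(0.025, 0.975))
})

# Average the interval bounds across subsets
blb_ci <- rowMeans(ci_per_subset)
blb_ci
```

Because each resample is stored as b counts rather than n rows, the estimator only ever touches b distinct rows, which is the source of the computational savings.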