int_pctl | R Documentation |
Calculate bootstrap confidence intervals using various methods.
int_pctl(.data, ...)
## S3 method for class 'bootstraps'
int_pctl(.data, statistics, alpha = 0.05, ...)
int_t(.data, ...)
## S3 method for class 'bootstraps'
int_t(.data, statistics, alpha = 0.05, ...)
int_bca(.data, ...)
## S3 method for class 'bootstraps'
int_bca(.data, statistics, alpha = 0.05, .fn, ...)
.data |
A data frame containing the bootstrap resamples created using
|
... |
Arguments to pass to |
statistics |
An unquoted column name or |
alpha |
Level of significance. |
.fn |
A function to calculate the statistic of interest. The
function should take an |
Percentile intervals are the standard method of obtaining confidence intervals but require thousands of resamples to be accurate. T-intervals may need fewer resamples but require a corresponding variance estimate. Bias-corrected and accelerated intervals require the original function that was used to create the statistics of interest and are computationally taxing.
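To make the contrast between percentile and t-intervals concrete, the following is a minimal base-R sketch, not the package's internal code. It assumes the statistic is the mean of `mtcars$mpg` and that the standard error of a mean is `sd(x) / sqrt(n)`; the variable names (`theta_hat`, `boot`, `t_star`, and so on) are illustrative only. The percentile interval needs only the resampled estimates, while the studentized interval also needs a standard error for each resample and for the original sample.

```r
# Illustrative sketch only: base R, no rsample functions involved.
set.seed(1)
alpha  <- 0.05
n_boot <- 2000
x <- mtcars$mpg
n <- length(x)

# Estimate and standard error from the original (apparent) sample
theta_hat <- mean(x)
se_hat    <- sd(x) / sqrt(n)

# Each bootstrap resample yields an estimate and its own standard error
boot <- t(replicate(n_boot, {
  xb <- sample(x, size = n, replace = TRUE)
  c(estimate = mean(xb), std_err = sd(xb) / sqrt(n))
}))

# Percentile interval: quantiles of the resampled estimates
pctl <- quantile(boot[, "estimate"],
                 probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)

# Studentized (bootstrap-t) interval: quantiles of the studentized
# statistics, rescaled by the original-sample standard error
t_star <- (boot[, "estimate"] - theta_hat) / boot[, "std_err"]
t_q    <- quantile(t_star, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)
t_int  <- c(theta_hat - t_q[2] * se_hat, theta_hat - t_q[1] * se_hat)

rbind(percentile = pctl, student_t = t_int)
```

The studentized version cannot be computed when no per-resample standard error is available, which is why the t-interval requires a variance estimate alongside each bootstrap estimate.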
Each function returns a tibble with columns .lower, .estimate, .upper, .alpha, .method, and term. .method is the type of interval (e.g., "percentile", "student-t", or "BCa"). term is the name of the estimate. Note that the .estimate returned from int_pctl() is the mean of the estimates from the bootstrap resamples and not the estimate from the apparent model.
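The distinction in the note above can be illustrated with a small base-R sketch. It is an assumption-laden example rather than package code: the statistic (a Spearman correlation between `disp` and `mpg` in `mtcars`) and the names `stat`, `apparent_est`, and `boot_mean` are chosen here purely for illustration.

```r
# Illustrative sketch only: compares the original-sample estimate with the
# mean of the bootstrap estimates for a hypothetical statistic.
set.seed(2)

# Hypothetical statistic: Spearman rank correlation between disp and mpg
stat <- function(d) cor(d$disp, d$mpg, method = "spearman")

# Estimate computed once on the original (apparent) data
apparent_est <- stat(mtcars)

# Estimates computed on resamples drawn with replacement
boot_est <- replicate(1000, {
  idx <- sample(nrow(mtcars), replace = TRUE)
  stat(mtcars[idx, ])
})

# Mean of the bootstrap estimates
boot_mean <- mean(boot_est)

c(apparent = apparent_est, bootstrap_mean = boot_mean)
```

The two values are usually close but not identical, so the .estimate column of a percentile interval should not be read as the fit on the original data.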
https://rsample.tidymodels.org/articles/Applications/Intervals.html
Davison, A., & Hinkley, D. (1997). Bootstrap Methods and their Application. Cambridge: Cambridge University Press. doi:10.1017/CBO9780511802843
reg_intervals()
library(rsample)
library(broom)
library(dplyr)
library(purrr)
library(tibble)
library(tidyr)
# ------------------------------------------------------------------------------
lm_est <- function(split, ...) {
  lm(mpg ~ disp + hp, data = analysis(split)) %>%
    tidy()
}

set.seed(52156)
car_rs <-
  bootstraps(mtcars, 500, apparent = TRUE) %>%
  mutate(results = map(splits, lm_est))

int_pctl(car_rs, results)
int_t(car_rs, results)
int_bca(car_rs, results, .fn = lm_est)
# ------------------------------------------------------------------------------
# putting results into a tidy format
rank_corr <- function(split) {
  dat <- analysis(split)
  tibble(
    term = "corr",
    estimate = cor(dat$sqft, dat$price, method = "spearman"),
    # don't know the analytical std.error so no t-intervals
    std.error = NA_real_
  )
}

set.seed(69325)
data(Sacramento, package = "modeldata")
bootstraps(Sacramento, 1000, apparent = TRUE) %>%
  mutate(correlations = map(splits, rank_corr)) %>%
  int_pctl(correlations)
# ------------------------------------------------------------------------------
# An example of computing the interval for each value of a custom grouping
# factor (type of house in this example)
# Get regression estimates for each house type
lm_est <- function(split, ...) {
  analysis(split) %>%
    tidyr::nest(.by = c(type)) %>%
    # Compute regression estimates for each house type
    mutate(
      betas = purrr::map(data, ~ lm(log10(price) ~ sqft, data = .x) %>% tidy())
    ) %>%
    # Convert the column name to begin with a period
    rename(.type = type) %>%
    select(.type, betas) %>%
    unnest(cols = betas)
}

set.seed(52156)
house_rs <-
  bootstraps(Sacramento, 1000, apparent = TRUE) %>%
  mutate(results = map(splits, lm_est))

int_pctl(house_rs, results)