knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
p-values can be computed by inverting the corresponding confidence intervals, as described in Section 14.2 of Thulin (2024) and Section 3.12 in Hall (1992). This package contains functions for computing bootstrap p-values in this way. The approach relies on the fact that:
Summary tables with confidence intervals and p-values for the coefficients of regression models can be obtained using the boot_summary
(most models) and censboot_summary
(models with censored response variables) functions. Currently, the following models are supported:
lm
,glm
or glm.nb
,nls
,MASS::rlm
,MASS:polr
,lme4::lmer
or lmerTest::lmer
,lme4::glmer
.survival::coxph
(using censboot_summary
).survival::survreg
or rms::psm
(using censboot_summary
).residuals(object, type="pearson")
returns Pearson residuals; fitted(object)
returns fitted values; hatvalues(object)
returns the leverages, or perhaps the value 1 which will effectively ignore setting the hatvalues. In addition, the data
argument should contain no missing values among the columns actually used in fitting the model.A number of examples are available in Chapters 8 and 9 of Modern Statistics with R.
library(boot.pval)
Here are some simple examples with a linear regression model for the mtcars
data:
# Bootstrap summary of a linear model for mtcars: model <- lm(mpg ~ hp + vs, data = mtcars) boot_summary(model) # Use 9999 bootstrap replicates and adjust p-values for # multiplicity using Holm's method: boot_summary(model, R = 9999, adjust.method = "holm") # Use case resampling instead of residual resampling: boot_summary(model, method = "case") # Export results to a gt table: boot_summary(model, R = 9999) |> summary_to_gt()
See Davison and Hinkley (1997) for details about residual resampling (the default) and case resampling.
# Export results to a Word document: library(flextable) boot_summary(model, R = 9999) |> summary_to_flextable() |> save_as_docx(path = "my_table.docx")
And a toy example for a generalised linear mixed model (using a small number of bootstrap repetitions):
library(lme4) model <- glmer(TICKS ~ YEAR + (1|LOCATION), data = grouseticks, family = poisson) boot_summary(model, R = 99)
For complex models, speed can be greatly improved by using parallelisation. For lmer
and glmer
models, this is set using the parallel
(available options are "multicore"
and "snow"
). The number of CPUs to use is set using ncpus
.
model <- glmer(TICKS ~ YEAR + (1|LOCATION), data = grouseticks, family = poisson) boot_summary(model, R = 999, parallel = "multicore", ncpus = 10)
For other models, use ncores
:
model <- lm(mpg ~ hp + vs, data = mtcars) boot_summary(model, R = 9999, ncores = 10)
Survival regression models should be fitted using the argument model = TRUE
. A summary table can then be obtained using censboot_summary
. By default, the table contains exponentiated coefficients (i.e. hazard ratios, in the case of a Cox PH model).
library(survival) # Weibull AFT model: model <- survreg(formula = Surv(time, status) ~ age + sex, data = lung, dist = "weibull", model = TRUE) # Table with exponentiated coefficients: censboot_summary(model) # Cox PH model: model <- coxph(formula = Surv(time, status) ~ age + sex, data = lung, model = TRUE) # Table with hazard ratios: censboot_summary(model) # Table with original coefficients: censboot_summary(model, coef = "raw")
To speed up computations using parallelisation, use the parallel
and ncpus
arguments:
censboot_summary(model, parallel = "multicore", ncpus = 10)
Bootstrap p-values for hypothesis tests based on boot
objects can be obtained using the boot.pval
function. The following examples are extensions of those given in the documentation for boot::boot
:
# Hypothesis test for the city data # H0: ratio = 1.4 library(boot) ratio <- function(d, w) sum(d$x * w)/sum(d$u * w) city.boot <- boot(city, ratio, R = 999, stype = "w", sim = "ordinary") boot.pval(city.boot, theta_null = 1.4) # Studentized test for the two sample difference of means problem # using the final two series of the gravity data. diff.means <- function(d, f) { n <- nrow(d) gp1 <- 1:table(as.numeric(d$series))[1] m1 <- sum(d[gp1,1] * f[gp1])/sum(f[gp1]) m2 <- sum(d[-gp1,1] * f[-gp1])/sum(f[-gp1]) ss1 <- sum(d[gp1,1]^2 * f[gp1]) - (m1 * m1 * sum(f[gp1])) ss2 <- sum(d[-gp1,1]^2 * f[-gp1]) - (m2 * m2 * sum(f[-gp1])) c(m1 - m2, (ss1 + ss2)/(sum(f) - 2)) } grav1 <- gravity[as.numeric(gravity[,2]) >= 7, ] grav1.boot <- boot(grav1, diff.means, R = 999, stype = "f", strata = grav1[ ,2]) boot.pval(grav1.boot, type = "stud", theta_null = 0)
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.