| np_glm_b | R Documentation |
np_glm_b uses general Bayesian inference with the loss-likelihood bootstrap. As implemented here, it is a Bayesian non-parametric inferential engine for generalized linear models. Applicable data types are continuous (use family = gaussian()), counts (use family = poisson()), and binomial (use family = binomial()).
np_glm_b(
formula,
data,
family,
loss = "selfinformation",
loss_gradient,
trials,
n_draws,
ask_before_full_sampling = TRUE,
CI_level = 0.95,
ROPE,
seed = 1,
mc_error = 0.01
)
formula
A formula specifying the model.
data
A data frame in which the variables specified in the formula will be found. If missing, the variables are searched for in the standard way. However, it is strongly recommended that you use this argument so that other generics for bayesics objects work correctly.
family
A description of the error distribution and link function to be used in the model. See family for details of family functions.
loss
Either "selfinformation", or a function that takes two arguments: the first should be the vector of outcomes and the second the expected value of y. The function should return the loss evaluated at each observation. By default, the self-information loss is used (i.e., the negative log-likelihood). Note: this really does mean the expected value of y, even for binomial (i.e., n*p).
loss_gradient
If loss is a user-defined function (as opposed to "selfinformation"), supplying the gradient of the loss will speed up the algorithm.
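As an illustration of the expected shape of these two arguments (the function and argument names here are hypothetical; only the two-argument signature is documented above), a squared-error loss and its gradient might be written as:

```r
# Per-observation squared-error loss:
# y is the vector of outcomes, mu the expected value of y.
sq_loss <- function(y, mu) (y - mu)^2

# Gradient of the loss with respect to mu, evaluated per observation.
sq_loss_grad <- function(y, mu) -2 * (y - mu)
```

These could then be supplied as loss = sq_loss and loss_gradient = sq_loss_grad.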
trials
Integer vector giving the number of trials for each observation if family = binomial().
n_draws
integer. Number of posterior draws to obtain. If left missing, the large-sample approximation will be used.
ask_before_full_sampling
logical. If TRUE, the user is asked to confirm before committing to the full number of posterior draws needed for precise credible interval bounds. Defaults to TRUE because the bootstrap is computationally intensive. Parallelization via future::plan is highly recommended for the full sample.
CI_level
numeric. Credible interval level.
ROPE
Vector of positive values giving the ROPE boundaries for each regression coefficient. The ROPE boundary for the intercept may optionally be omitted. If missing, defaults follow the suggestions of Kruschke (2018).
seed
integer. Always set your seed!!!
mc_error
If the large-sample approximation is not used, the number of posterior draws is chosen so that, with 99% probability, the bounds of the credible intervals are within mc_error of their exact values.
Consider a population parameter of interest defined by minimizing a loss function \ell with respect to the population distribution:
\theta(F_y) := \underset{\theta\in\Theta}{\text{argmin}} \int \ell(\theta,y)\,dF_y
If we use a non-parametric Dirichlet process prior on the distribution
of y, F_y, and let the concentration parameter go to zero, we
have the Bayesian bootstrap applied to a general Bayesian updating framework
dictated by the loss function.
By default, the loss function is the self-information loss, i.e., the negative
log likelihood. This then resembles a typical glm_b implementation,
but is more robust to model misspecification.
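The scheme above can be sketched in a few lines of base R. The following illustrates one loss-likelihood bootstrap draw for a logistic model under the self-information loss (a minimal sketch for intuition, not the package's implementation; the use of optim() with BFGS is an assumption):

```r
# One posterior draw via the loss-likelihood bootstrap:
# 1. Draw Dirichlet(1, ..., 1) weights over the n observations
#    (normalized standard exponentials).
# 2. Minimize the weighted negative log-likelihood.
llb_draw <- function(X, y) {
  n <- length(y)
  w <- rexp(n)
  w <- w / sum(w)
  wnll <- function(beta) {
    p <- 1 / (1 + exp(-drop(X %*% beta)))
    -sum(w * (y * log(p) + (1 - y) * log(1 - p)))
  }
  optim(rep(0, ncol(X)), wnll, method = "BFGS")$par
}
```

Repeating llb_draw() n_draws times yields a posterior sample for the regression coefficients.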
np_glm_b() returns an object of class "np_glm_b", which behaves as a list with the following elements:
summary - a tibble giving results for regression coefficients.
S P Lyddon, C C Holmes, S G Walker, General Bayesian updating and the loss-likelihood bootstrap, Biometrika, Volume 106, Issue 2, June 2019, Pages 465–478, https://doi.org/10.1093/biomet/asz006
# Generate some data
set.seed(2025)
N <- 500
test_data <-
  data.frame(x1 = rnorm(N),
             x2 = rnorm(N),
             x3 = letters[1:5])
test_data$outcome <-
  rbinom(N, 1, 1.0 / (1.0 + exp(-(-2 + test_data$x1 + 2 * (test_data$x3 %in% c("d", "e"))))))
# Fit the GLM via the (non-parametric) loss-likelihood bootstrap.
fit1 <-
np_glm_b(outcome ~ x1 + x2 + x3,
data = test_data,
family = binomial())
fit1
summary(fit1,
CI_level = 0.99)
plot(fit1)
coef(fit1)
credint(fit1)
predict(fit1,
newdata = fit1$data[1,])
vcov(fit1)
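The quantities reported in summary can also be computed directly from any matrix of posterior draws; a generic base R sketch (the function names here are illustrative and not part of the package):

```r
# draws: one row per posterior draw, one column per coefficient.
credint_from_draws <- function(draws, CI_level = 0.95) {
  alpha <- 1 - CI_level
  t(apply(draws, 2, quantile, probs = c(alpha / 2, 1 - alpha / 2)))
}

# Proportion of draws falling inside the ROPE [-rope_j, rope_j]
# for each coefficient j.
rope_prop <- function(draws, rope) {
  colMeans(abs(draws) <= rep(rope, each = nrow(draws)))
}
```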