| np_glm_b | R Documentation |
np_glm_b uses general Bayesian inference with the loss-likelihood bootstrap. As implemented here, it is a Bayesian non-parametric inferential engine for generalized linear models. Applicable data types are continuous (use family = gaussian()), counts (use family = poisson()), and binomial (use family = binomial()).
np_glm_b(
formula,
data,
family,
loss = "selfinformation",
loss_gradient,
trials,
n_draws,
ask_before_full_sampling = TRUE,
CI_level = 0.95,
ROPE,
seed = 1,
mc_error = 0.01
)
formula
A formula specifying the model.
data
A data frame in which the variables specified in the formula will be found. If missing, the variables are searched for in the standard way. However, it is strongly recommended that you use this argument so that other generics for bayesics objects work correctly.
family
A description of the error distribution and link function to be used in the model. See family for details of family functions.
loss
Either "selfinformation", or a function that takes two arguments: the first should be the vector of outcomes and the second the expected value of y. The function should return the loss evaluated at each observation. By default, the self-information loss is used (i.e., the negative log-likelihood). Note: this really does mean the expected value of y, even for binomial (i.e., n*p).
loss_gradient
If loss is a user-defined function (as opposed to "selfinformation"), supplying the gradient of the loss will speed up the algorithm.
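As an illustration of the expected shape of these two arguments (the function and argument names here are hypothetical; only the two-argument signature is documented above), a squared-error loss and its gradient might be written as:

```r
# Per-observation squared-error loss:
# y is the vector of outcomes, mu the expected value of y.
sq_loss <- function(y, mu) (y - mu)^2

# Gradient of the loss with respect to mu, evaluated per observation.
sq_loss_grad <- function(y, mu) -2 * (y - mu)
```

These could then be supplied as loss = sq_loss and loss_gradient = sq_loss_grad.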
trials
Integer vector giving the number of trials for each observation if family = binomial().
n_draws
integer. Number of posterior draws to obtain. If left missing, the large-sample approximation will be used.
ask_before_full_sampling
logical. If TRUE, the user is asked to confirm before committing to the full number of posterior draws needed for precise credible interval bounds. Defaults to TRUE because the bootstrap is computationally intensive. Parallelization via future::plan is highly recommended for the full sample.
CI_level
numeric. Credible interval level.
ROPE
Vector of positive values giving the ROPE boundaries for each regression coefficient. The ROPE boundary for the intercept may optionally be omitted. If missing, defaults follow the suggestions of Kruschke (2018).
seed
integer. Always set your seed!!!
mc_error
If the large-sample approximation is not used, the number of posterior draws is chosen so that, with 99% probability, the bounds of the credible intervals are within mc_error of their exact values.
Consider a population parameter of interest defined by minimizing a loss function \ell with respect to the population distribution:
\theta(F_y) := \underset{\theta\in\Theta}{\text{argmin}} \int \ell(\theta,y)\,dF_y
If we use a non-parametric Dirichlet process prior on the distribution
of y, F_y, and let the concentration parameter go to zero, we
have the Bayesian bootstrap applied to a general Bayesian updating framework
dictated by the loss function.
By default, the loss function is the self-information loss, i.e., the negative
log likelihood. This then resembles a typical glm_b implementation,
but is more robust to model misspecification.
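The scheme above can be sketched in a few lines of base R. The following illustrates one loss-likelihood bootstrap draw for a logistic model under the self-information loss (a minimal sketch for intuition, not the package's implementation; the use of optim() with BFGS is an assumption):

```r
# One posterior draw via the loss-likelihood bootstrap:
# 1. Draw Dirichlet(1, ..., 1) weights over the n observations
#    (normalized standard exponentials).
# 2. Minimize the weighted negative log-likelihood.
llb_draw <- function(X, y) {
  n <- length(y)
  w <- rexp(n)
  w <- w / sum(w)
  wnll <- function(beta) {
    p <- 1 / (1 + exp(-drop(X %*% beta)))
    -sum(w * (y * log(p) + (1 - y) * log(1 - p)))
  }
  optim(rep(0, ncol(X)), wnll, method = "BFGS")$par
}
```

Repeating llb_draw() n_draws times yields a posterior sample for the regression coefficients.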
np_glm_b() returns an object of class "np_glm_b", which behaves as a list with the following elements:
summary - a tibble giving results for regression coefficients.
S P Lyddon, C C Holmes, S G Walker, General Bayesian updating and the loss-likelihood bootstrap, Biometrika, Volume 106, Issue 2, June 2019, Pages 465–478, https://doi.org/10.1093/biomet/asz006
# Generate some data
set.seed(2025)
N <- 500
test_data <-
  data.frame(x1 = rnorm(N),
             x2 = rnorm(N),
             x3 = letters[1:5])
test_data$outcome <-
  rbinom(N, 1, 1.0 / (1.0 + exp(-(-2 + test_data$x1 + 2 * (test_data$x3 %in% c("d", "e"))))))
# Fit the GLM via the (non-parametric) loss-likelihood bootstrap.
fit1 <-
np_glm_b(outcome ~ x1 + x2 + x3,
data = test_data,
family = binomial())
fit1
summary(fit1,
CI_level = 0.99)
plot(fit1)
coef(fit1)
credint(fit1)
predict(fit1,
newdata = fit1$data[1,])
vcov(fit1)
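The quantities reported in summary can also be computed directly from any matrix of posterior draws; a generic base R sketch (the function names here are illustrative and not part of the package):

```r
# draws: one row per posterior draw, one column per coefficient.
credint_from_draws <- function(draws, CI_level = 0.95) {
  alpha <- 1 - CI_level
  t(apply(draws, 2, quantile, probs = c(alpha / 2, 1 - alpha / 2)))
}

# Proportion of draws falling inside the ROPE [-rope_j, rope_j]
# for each coefficient j.
rope_prop <- function(draws, rope) {
  colMeans(abs(draws) <= rep(rope, each = nrow(draws)))
}
```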