| lm_b | R Documentation |
lm_b is used to fit linear models. It can be used to carry out regression, single stratum analysis of variance and analysis of covariance (although aov_b may provide a more convenient interface for ANOVA.)
lm_b(
formula,
data,
weights,
prior = c("zellner", "conjugate", "improper")[1],
zellner_g,
prior_beta_mean,
prior_beta_precision,
prior_var_shape,
prior_var_rate,
ROPE,
CI_level = 0.95
)
formula |
A formula specifying the model. |
data |
A data frame in which the variables specified in the formula will be found. If missing, the variables are searched for in the standard way. However, it is strongly recommended that you use this argument so that other generics for bayesics objects work correctly. |
weights |
an optional vector of weights to be used in the fitting process.
Should be NULL or a numeric vector. If non-NULL, it is assumed that the
variance of |
prior |
character. One of "zellner", "conjugate", or "improper", giving the type of prior used on the regression coefficients. |
zellner_g |
numeric. Positive number giving the value of "g" in Zellner's g prior. Ignored unless prior = "zellner". |
prior_beta_mean |
numeric vector of same length as regression coefficients (denoted p). Unless otherwise specified, automatically set to rep(0,p). Ignored unless prior = "conjugate". |
prior_beta_precision |
pxp matrix giving a priori precision matrix to be scaled by the residual precision. |
prior_var_shape |
numeric. Twice the shape parameter for the inverse gamma prior on
the residual variance(s). I.e., |
prior_var_rate |
numeric. Twice the rate parameter for the inverse gamma prior on
the residual variance(s). I.e., |
ROPE |
vector of positive values giving ROPE boundaries for each regression coefficient. Optionally, you can not include a ROPE boundary for the intercept. If missing, defaults go to those suggested by Kruchke (2018). |
CI_level |
numeric. Credible interval level. |
MODEL:
The likelihood is given by
y_i \overset{ind}{\sim} N(x_i'\beta,\sigma^2).
The prior is given by
\beta|\sigma^2 \sim N\left( \mu, \sigma^2 V^{-1} \right) \\
\sigma^2 \sim \Gamma^{-1}(a/2,b/2).
For Zellner's g prior, \frac{1}{g}X'X.
The default for the conjugate prior is based on arguments from
standardized regression. The default V is set according to the following: "a priori,
we are 95% certain that a standard deviation increase in X will not lead
to more than a 5 standard deviation in the mean of y." This leads to
V^{1/2} = diag(v^{1/2},s_{x_1},\ldots,s_{x_p}) / 2.5,
v^{-1} := \sum \frac{\bar x_j^2}{s_j^2}
where s_y is the standard deviation of y, and s_{x_j} is
the standard deviation of the j^{th} covariate.
Unless prior_var_shape AND prior_var_rate are provided,
the inverse gamma prior on the residual variance will place 50% prior
probability that the correlation between the response and the fitted values
is between 0.1 and 0.9.
If prior = "improper", then the prior is
\pi(\beta,\sigma^2) \propto \frac{1}{\sigma^2}.
ROPE:
If missing, the ROPE bounds will be given under the principle of "half of a
small effect size." Using Cohen's D of 0.2 as a small effect size, the ROPE is
built under the principle that moving the full range of X (i.e., \pm 2 s_x)
will not move the mean of y by more than the overall mean of y
minus 0.1s_y to the overall mean of y plus 0.1s_y.
The result is a ROPE equal to |\beta_j| < 0.05s_y/s_j. If the covariate is
binary, then this is simply |\beta_j| < 0.2s_y.
lm_b() returns an object of class "lm_b", which behaves as a list with the following elements:
summary - tibble giving results for regression coefficients
posterior_parameters - list giving the posterior parameters
hyperparameters - list giving the user input (or default) hyperparameters used
fitted - posterior mean of the individuals' means
residuals - posterior mean of the residuals
formula, data - input by user
# Generate some data
set.seed(2025)
N = 500
test_data =
data.frame(x1 = rnorm(N),
x2 = rnorm(N),
x3 = letters[1:5])
test_data$outcome =
rnorm(N,-1 + test_data$x1 + 2 * (test_data$x3 %in% c("d","e")) )
# Find good hyperparameters for the residual variance
s2_hyperparameters =
find_invgamma_parms(lower_R2 = 0.05,
upper_R2 = 0.95,
probability = 0.8,
response_variance = var(test_data$outcome))
## Check (should equal ~ 0.8)
extraDistr::pinvgamma((1.0 - 0.05) * var(test_data$outcome),
0.5 * s2_hyperparameters[1],
0.5 * s2_hyperparameters[2]) -
extraDistr::pinvgamma((1.0 - 0.95) * var(test_data$outcome),
0.5 * s2_hyperparameters[1],
0.5 * s2_hyperparameters[2])
# Fit the linear model using a conjugate prior
fit1 <-
lm_b(outcome ~ x1 + x2 + x3,
data = test_data,
prior = "conj",
prior_var_shape = s2_hyperparameters["shape"],
prior_var_rate = s2_hyperparameters["rate"])
fit1
summary(fit1,
CI_level = 0.99)
plot(fit1)
coef(fit1)
credint(fit1,
CI_level = 0.9)
bayes_factors(fit1)
bayes_factors(fit1,
by = "var")
AIC(fit1)
BIC(fit1)
DIC(fit1)
WAIC(fit1)
vcov(fit1)
preds = predict(fit1)
# Try other priors
fit2 <-
lm_b(outcome ~ x1 + x2 + x3,
data = test_data) # Default is prior = "zellner"
fit3 <-
lm_b(outcome ~ x1 + x2 + x3,
data = test_data,
prior = "improper")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.