summary.bayesreg: Summarization method for Bayesian penalised regression...

summary.bayesreg    R Documentation

Summarization method for Bayesian penalised regression (bayesreg) models

Description

summary method for Bayesian regression models fitted using bayesreg.

Usage

## S3 method for class 'bayesreg'
summary(
  object,
  sort.rank = FALSE,
  display.OR = FALSE,
  CI = 95,
  max.rows = NA,
  ...
)

Arguments

object

An object of class "bayesreg" created as a result of a call to bayesreg.

sort.rank

logical; if TRUE, the variables in the summary will be sorted by their importance as determined by their rank estimated by the Bayesian feature ranking algorithm.

display.OR

logical; if TRUE, the variables will be summarised in terms of their cross-sectional odds-ratios rather than their regression coefficients (logistic regression only).

CI

numerical; the level of the credible interval reported in summary. Default is 95 (i.e., 95% credible interval).

max.rows

numerical; the maximum number of rows (variables) to display in the summary. Default is to display all variables.

...

Further arguments passed to or from other methods.

Value

Returns an object with the following fields:

log.l

The log-likelihood of the model at the posterior mean estimates of the regression coefficients.

waic

The Widely Applicable Information Criterion (WAIC) score of the model.

waic.dof

The effective degrees-of-freedom of the model, as estimated by the WAIC.

r2

For non-binary data, the R^2 statistic.

sd.error

For non-binary data, the estimated standard deviation of the errors.

p.r2

For binary data, the pseudo-R^2 statistic.

mu.coef

The posterior means of the regression coefficients.

se.coef

The posterior standard deviations of the regression coefficients.

CI.coef

The posterior credible interval for the regression coefficients, at the level specified (default: 95%).

med.OR

For binary data, the posterior median of the cross-sectional odds-ratios.

se.OR

For binary data, the posterior standard deviation of the cross-sectional odds-ratios.

CI.OR

For binary data, the posterior credible interval for the cross-sectional odds-ratios.

t.stat

The posterior t-statistic for the coefficients.

n.stars

The significance level for the variable (see Details).

rank

The variable importance rank as estimated by the Bayesian feature ranking algorithm (see Details).

ESS

The effective sample size for the variable.

log.l0

For binary data, the log-likelihood of the null model (i.e., with only an intercept).

Details

The summary method computes a number of summary statistics and displays these for each variable in a table, along with suitable header information.

For continuous target variables, the header information includes a posterior estimate of the standard deviation of the random disturbances (errors), the R^2 statistic and the Widely Applicable Information Criterion (WAIC) statistic. For logistic regression models, the header information includes the negative log-likelihood at the posterior mean of the regression coefficients, the pseudo R^2 score and the WAIC statistic. For count data (Poisson and geometric), the header information includes an estimate of the degree of overdispersion (observed variance divided by expected variance around the conditional mean; a value > 1 indicates overdispersion and a value < 1 indicates underdispersion), the pseudo R^2 score and the WAIC statistic.
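The overdispersion ratio described above can be illustrated with a small base-R sketch. This is not the package's code: the helper name dispersion_ratio, the simulated data and the particular estimator (mean squared deviation from the conditional mean, divided by the mean variance a Poisson model expects) are assumptions made for illustration, and the conditional means are treated as known rather than estimated.

```r
# Hypothetical helper (not part of bayesreg): observed variance around the
# conditional mean divided by the variance a Poisson model expects (var = mu).
dispersion_ratio <- function(y, mu) {
  mean((y - mu)^2) / mean(mu)
}

set.seed(1)
n  <- 5000
x  <- runif(n)
mu <- exp(0.5 + x)                       # conditional means, assumed known here

y_pois <- rpois(n, lambda = mu)          # counts with variance equal to the mean
y_nb   <- rnbinom(n, size = 2, mu = mu)  # counts with variance above the mean

ratio_pois <- dispersion_ratio(y_pois, mu)  # close to 1
ratio_nb   <- dispersion_ratio(y_nb, mu)    # well above 1
```

A ratio near 1 is what a well-specified Poisson model would produce; the negative binomial counts give a clearly larger value because their variance exceeds their mean.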

The main table summarises properties of the coefficients for each of the variables. The first column is the variable name. The second and third columns are either the posterior mean and standard error (posterior standard deviation) of the coefficients, or the posterior median and standard error of the cross-sectional odds-ratios if display.OR = TRUE.
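As a rough illustration of how odds-ratio summaries relate to coefficient summaries, the base-R sketch below transforms simulated coefficient draws. It assumes the usual logistic-regression relationship, odds-ratio = exp(coefficient); the documentation above does not state how the "cross-sectional" odds-ratios are scaled, so the package's values may differ. The simulated draws and all variable names are hypothetical.

```r
# Simulated draws stand in for MCMC samples of one logistic regression coefficient.
set.seed(1)
beta_draws <- rnorm(1000, mean = 0.4, sd = 0.1)

# Assumed transformation: odds-ratio per unit change in the predictor.
or_draws <- exp(beta_draws)

med_or <- median(or_draws)                          # posterior median of the odds-ratio
se_or  <- sd(or_draws)                              # posterior standard deviation
ci_or  <- quantile(or_draws, probs = c(0.025, 0.975),
                   names = FALSE)                   # equal-tailed 95% interval
```

Because exp() is monotone, the median and the interval end-points of the odds-ratio are (up to interpolation) the exponentiated median and end-points of the coefficient; the mean and standard deviation do not transform this way, which is one reason to summarise odds-ratios by their median.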

The fourth and fifth columns are the end-points of the credible intervals of the coefficients (or of the odds-ratios if display.OR = TRUE), at the level given by CI. The sixth column displays the posterior t-statistic, calculated as the ratio of the posterior mean to the posterior standard deviation of the coefficient. The seventh column is the importance rank assigned to the variable by the Bayesian feature ranking algorithm.
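The posterior t-statistic and credible interval can be sketched from a vector of posterior draws in base R. The draws here are simulated, the variable names are hypothetical, and the interval is assumed to be equal-tailed; the text above does not say which type of interval the package reports.

```r
# Simulated draws stand in for MCMC samples of one regression coefficient.
set.seed(1)
beta_draws <- rnorm(1000, mean = 2, sd = 0.5)

post_mean <- mean(beta_draws)
post_sd   <- sd(beta_draws)

# Posterior t-statistic: posterior mean divided by posterior standard deviation.
t_stat <- post_mean / post_sd

# Equal-tailed credible interval at level CI (given in percent, as in summary()).
CI    <- 95
alpha <- 1 - CI / 100
ci    <- quantile(beta_draws, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)
```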

Between the seventh and eighth columns, up to two asterisks indicate significance: a variable receives a first asterisk if its 75% credible interval does not include zero, and a second asterisk if its 95% credible interval does not include zero. The final column gives an estimate of the effective sample size for the variable, ranging from 0 to n.samples; this is the number of i.i.d. draws from the posterior (were such draws possible in place of MCMC) to which the samples actually drawn are equivalent. This quantity is computed using the algorithm presented in the Stan Bayesian sampling package documentation.
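The asterisk rule can be written as a short base-R function applied to posterior draws. This is a sketch under assumptions: count_stars is a hypothetical helper, not a bayesreg function, the intervals are taken to be equal-tailed, and the "draws" are evenly spaced normal quantiles so that the results are deterministic.

```r
# Hypothetical helper: number of asterisks for one coefficient's posterior draws.
count_stars <- function(draws) {
  excludes_zero <- function(level) {
    alpha <- 1 - level
    q <- quantile(draws, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)
    q[1] > 0 || q[2] < 0
  }
  # One asterisk for the 75% interval, a second for the 95% interval.
  sum(c(excludes_zero(0.75), excludes_zero(0.95)))
}

# Deterministic stand-ins for posterior samples (normal quantiles on a grid).
p      <- ppoints(1000)
strong <- qnorm(p, mean = 2.0, sd = 0.5)  # both intervals exclude zero
weak   <- qnorm(p, mean = 0.8, sd = 0.5)  # only the 75% interval excludes zero
null   <- qnorm(p, mean = 0.0, sd = 0.5)  # both intervals include zero

stars <- c(strong = count_stars(strong),
           weak   = count_stars(weak),
           null   = count_stars(null))
```

Since an equal-tailed 95% interval contains the corresponding 75% interval, a variable that earns the second asterisk always has the first as well.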

See Also

The model fitting function bayesreg and prediction function predict.bayesreg.

Examples


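library(bayesreg)

# Simulate 100 observations on 20 predictors; only predictors 5 to 9 have non-zero effects
X <- matrix(rnorm(100*20), 100, 20)
b <- matrix(0, 20, 1)
b[1:9] <- c(0, 0, 0, 0, 5, 4, 3, 2, 1)
y <- X %*% b + rnorm(100, 0, 1)
df <- data.frame(X, y)

# Horseshoe regression (using max 2 cores for CRAN check compliance)
rv.hs <- bayesreg(y ~ ., df, prior = "hs", n.cores = 2)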

# Summarise without sorting by variable rank
rv.hs.s <- summary(rv.hs)

# Summarise sorting by variable rank and provide 75% credible intervals
rv.hs.s <- summary(rv.hs, sort.rank = TRUE, CI=75)


bayesreg documentation built on Sept. 30, 2024, 9:18 a.m.