posteriorpredictive.pep: Posterior predictive distribution under Bayesian model...

View source: R/basic_complementary_functions.R

posteriorpredictive.pepR Documentation

Posterior predictive distribution under Bayesian model averaging

Description

Simulates values from the posterior predictive distribution under Bayesian model averaging.

Usage

posteriorpredictive.pep(
  object,
  xnew,
  ssize = 10000,
  estimator = "BMA",
  n.models = NULL,
  cumul.prob = 0.99
)

Arguments

object

An object of class pep (e.g., output of pep.lm).

xnew

An optional data frame of numeric, the new data on the explanatory variables to be used for prediction. The data frame needs to contain information about all explanatory variables available in the full model; if not an error message is output. If omitted, the data frame employed for fitting the full model is used.

ssize

Positive integer, the number of values to be simulated from each posterior predictive distribution. Default value=10000.

estimator

A character, the type of prediction. One of “BMA” (Bayesian model averaging, default), “MAP” (maximum a posteriori model) or “MPM” (median probability model). Default value="BMA".

n.models

Positive integer, the number of (top) models where the average is based on or NULL. Relevant for estimator="BMA". Default value=NULL.

cumul.prob

Numeric between zero and one, cumulative probability of top models to be used for computing the average. Relevant for estimator="BMA". Default value=0.99.

Details

For the computations, Equation 11 of Garcia–Donato and Forte (2018) is used. That (simplified) formula arises when changing the prior on the model parameters to the reference prior. This change of prior is justified in Garcia–Donato and Forte (2018). The resulting formula is a mixture distribution and the simulation is implemented as follows: firstly the model (component) based on its posterior probability is chosen and subsequently the value for the response is drawn from the corresponding Student distribution.

The case of missing data (i.e., presence of NA’s) and non–quantitative data in the new data frame xnew is not currently supported.

Let k be the number of models with cumulative posterior probability up to the given value of cumul.prob. Then, for Bayesian model averaging the prediction is based on the top (k+1) models if they exist, otherwise on the top k models.

When both n.models and cumul.prob are provided — once specifying the number of models for the given cumulative probability as described above — the minimum between the two numbers is used for prediction.

Value

posteriorpredictive.pep returns a matrix (of dimension ssize \times nrow(xnew)) — containing the simulated data. More specifically, column i contains the simulated values from the posterior predictive corresponding to the i–th new observation (i.e., i–th row of xnew).

References

Garcia–Donato, G. and Forte, A. (2018) Bayesian Testing, Variable Selection and Model Averaging in Linear Models using R with BayesVarSel. The R Journal, 10(1): 155–174. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.32614/RJ-2018-021")}

Examples

data(UScrime_data)
X <- UScrime_data[,-15]
set.seed(123)
res <- pep.lm(y~.,data=UScrime_data[1:45,],intrinsic=TRUE,
               algorithmic.choice="MC3",itermc3=4000)
resf <- posteriorpredictive.pep(res,ssize=2000,n.models=5)
resf2 <- posteriorpredictive.pep(res,ssize=2000,estimator="MPM")
resp <- posteriorpredictive.pep(res,xnew=X[46:47,],ssize=2000,n.models=5)

PEPBVS documentation built on April 3, 2025, 6:12 p.m.