View source: R/basic_complementary_functions.R
posteriorpredictive.pep | R Documentation |
Simulates values from the posterior predictive distribution under Bayesian model averaging.
posteriorpredictive.pep(
object,
xnew,
ssize = 10000,
estimator = "BMA",
n.models = NULL,
cumul.prob = 0.99
)
object |
An object of class pep (e.g., output of |
xnew |
An optional data frame of numeric, the new data on the explanatory variables to be used for prediction. The data frame needs to contain information about all explanatory variables available in the full model; if not an error message is output. If omitted, the data frame employed for fitting the full model is used. |
ssize |
Positive integer, the number of values to be simulated from each posterior predictive distribution. Default value=10000. |
estimator |
A character, the type of prediction. One of
“BMA” (Bayesian model averaging, default),
“MAP” (maximum a posteriori model) or “MPM” (median probability model).
Default value= |
n.models |
Positive integer, the number of (top) models where
the average is based on or |
cumul.prob |
Numeric between zero and one, cumulative probability of
top models to be used for computing the average. Relevant for |
For the computations, Equation 11 of Garcia–Donato and Forte (2018) is used. That (simplified) formula arises when changing the prior on the model parameters to the reference prior. This change of prior is justified in Garcia–Donato and Forte (2018). The resulting formula is a mixture distribution and the simulation is implemented as follows: firstly the model (component) based on its posterior probability is chosen and subsequently the value for the response is drawn from the corresponding Student distribution.
The case of missing data (i.e., presence of NA’s) and non–quantitative data
in the new data frame
xnew
is not currently supported.
Let k
be the number of models with cumulative posterior probability up
to the given value of cumul.prob
. Then, for Bayesian model averaging
the prediction is based on the top (k+1)
models if they exist, otherwise
on the top k
models.
When both n.models
and cumul.prob
are provided — once
specifying the number of models for the given cumulative probability as
described above — the minimum between the two numbers is used for prediction.
posteriorpredictive.pep
returns a matrix (of dimension
ssize
\times
nrow(xnew)
) — containing the
simulated data. More specifically, column i
contains the simulated
values from the posterior predictive corresponding to the i
–th new
observation (i.e., i
–th row of xnew
).
Garcia–Donato, G. and Forte, A. (2018) Bayesian Testing, Variable Selection and Model Averaging in Linear Models using R with BayesVarSel. The R Journal, 10(1): 155–174. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.32614/RJ-2018-021")}
data(UScrime_data)
X <- UScrime_data[,-15]
set.seed(123)
res <- pep.lm(y~.,data=UScrime_data[1:45,],intrinsic=TRUE,
algorithmic.choice="MC3",itermc3=4000)
resf <- posteriorpredictive.pep(res,ssize=2000,n.models=5)
resf2 <- posteriorpredictive.pep(res,ssize=2000,estimator="MPM")
resp <- posteriorpredictive.pep(res,xnew=X[46:47,],ssize=2000,n.models=5)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.