pep.lm | R Documentation |
Given a formula and a data frame, performs Bayesian variable selection using either full enumeration and evaluation of all models in the model space (for model spaces of small–to–moderate dimension) or the MC3 algorithm (for model spaces of large dimension). Normal linear models are assumed for the data with the prior distribution on the model parameters (beta coefficients and error variance) being the PEP or the intrinsic. The prior distribution on the model space can be the uniform on models or the uniform on the model dimension (special case of the beta–binomial prior). The model space consists of all possible models including an intercept term.
pep.lm(
formula,
data,
algorithmic.choice = "automatic",
intrinsic = FALSE,
reference.prior = TRUE,
beta.binom = TRUE,
ml_constant.term = FALSE,
burnin = 1000,
itermc3 = 11000
)
formula |
A formula, defining the full model. |
data |
A data frame (of numeric values), containing the data. |
algorithmic.choice |
A character, the type of algorithm to be used for selection: full enumeration and evaluation of all models or the MC3 algorithm. One of “automatic” (the choice is made automatically based on the number of explanatory variables in the full model), “full enumeration” or “MC3”. Default value=“automatic”. |
intrinsic |
Logical, indicating whether the PEP
( |
reference.prior |
Logical, indicating whether the reference prior
( |
beta.binom |
Logical, indicating whether the beta–binomial
distribution ( |
ml_constant.term |
Logical, indicating whether the constant
(marginal likelihood of the null/intercept–only model) should be
included in computing the marginal likelihood of a model ( |
burnin |
Non–negative integer, the burnin period for the MC3 algorithm. Default value=1000. |
itermc3 |
Positive integer (larger than |
The function works when p ≤ n-2, where p is the number of explanatory variables of the full model and n is the sample size.
The reference model is the null model (i.e., intercept–only model).
The case of missing data (i.e., presence of NAs in either the response or the explanatory variables) is not currently supported. Further, the data need to be quantitative.
All models considered (i.e., model space) include an intercept term.
If p > 1, the explanatory variables cannot have an exact linear relationship (perfect multicollinearity).
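The input requirements listed above can be checked before calling the function. The following is a minimal base R sketch; `check_pep_input` is a hypothetical helper written for illustration, not part of the package, and the rank test includes the intercept column, so a constant predictor is also flagged.

```r
# Hypothetical helper (not part of the package): checks the documented
# preconditions on a response vector y and a data frame of predictors X.
check_pep_input <- function(y, X) {
  X <- as.matrix(X)
  n <- nrow(X)
  p <- ncol(X)
  problems <- character(0)
  if (anyNA(y) || anyNA(X)) problems <- c(problems, "missing values present")
  if (!is.numeric(y) || !is.numeric(X)) problems <- c(problems, "non-numeric data")
  if (p > n - 2) problems <- c(problems, "p exceeds n - 2")
  # Rank of the design matrix with an intercept column; a deficient rank
  # signals an exact linear relationship among its columns.
  if (length(problems) == 0 && qr(cbind(1, X))$rank < p + 1) {
    problems <- c(problems, "perfect multicollinearity")
  }
  problems
}

set.seed(1)
y     <- rnorm(10)
X_ok  <- data.frame(x1 = rnorm(10), x2 = rnorm(10))
X_bad <- transform(X_ok, x3 = 2 * x1 - x2)  # x3 is an exact linear combination

check_pep_input(y, X_ok)   # no problems reported
check_pep_input(y, X_bad)  # reports the collinearity
```

An empty character vector means none of the documented restrictions was violated.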
The reference prior as baseline corresponds to hyperparameter values d0=0 and d1=0, while the dependence Jeffreys prior corresponds to model–dependent values for the hyperparameters d0 and d1; see Fouskakis and Ntzoufras (2022) for more details.
For computing the marginal likelihood of a model, Equation 16 of Fouskakis and Ntzoufras (2022) is used.
When ml_constant.term=FALSE, the log marginal likelihood of a model in the output is shifted by -logC1 (logC1: log marginal likelihood of the null model).
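Because the shift subtracts the same constant from every model, it should not affect posterior model probabilities, and the null model's shifted value is 0. A small base R sketch of this, using made-up log marginal likelihoods and a hypothetical helper `post_prob` (neither comes from the package):

```r
# Made-up log marginal likelihoods for three models (illustration only)
log_ml <- c(null = -120.4, m1 = -115.2, m2 = -116.9)
prior  <- c(null = 1/3, m1 = 1/3, m2 = 1/3)

# Hypothetical helper: posterior model probabilities, normalised on the
# log scale for numerical stability
post_prob <- function(log_ml, prior) {
  w <- log_ml + log(prior)
  w <- exp(w - max(w))
  w / sum(w)
}

logC1       <- log_ml[["null"]]               # log marginal likelihood of the null model
p_unshifted <- post_prob(log_ml, prior)
p_shifted   <- post_prob(log_ml - logC1, prior)

all.equal(p_unshifted, p_shifted)
```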
When the prior on the model space is beta–binomial (i.e., beta.binom=TRUE), the following special case is used: uniform prior on model dimension.
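To illustrate how the two priors on the model space differ, the sketch below computes the prior probability of a single model with k explanatory variables under each choice. It assumes the standard form of the uniform prior on model dimension, in which each dimension k = 0, ..., p receives mass 1/(p+1), shared equally among the choose(p, k) models of that dimension; the value p = 5 is arbitrary.

```r
p <- 5
k <- 0:p  # number of explanatory variables in a model

# Uniform on models: each of the 2^p models gets the same probability
prior_uniform <- rep(1 / 2^p, length(k))

# Uniform on model dimension: each dimension gets 1/(p+1), split evenly
# among the choose(p, k) models of that dimension
prior_dim <- 1 / ((p + 1) * choose(p, k))

# Total prior mass assigned to each dimension under the two priors
mass_uniform <- prior_uniform * choose(p, k)
mass_dim     <- prior_dim * choose(p, k)

round(rbind(k, mass_uniform, mass_dim), 4)
```

Under the uniform prior on models, mid-sized dimensions carry most of the prior mass simply because they contain more models; the uniform prior on dimension removes that imbalance.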
If algorithmic.choice equals “automatic” then the choice of the selection algorithm is as follows: if p < 20, full enumeration and evaluation of all models in the model space is performed, otherwise the MC3 algorithm is used.
To avoid potential memory or time constraints, if algorithmic.choice equals “full enumeration” but p ≥ 20, the MC3 algorithm is used instead (with a warning message issued).
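The selection rule described in the two paragraphs above can be sketched as a small function. `choose_algorithm` is a hypothetical name used only for illustration; it mirrors the documented behaviour and is not the package's internal code, and the warning text is made up.

```r
# Hypothetical helper mirroring the documented rule (not the package's code)
choose_algorithm <- function(p, algorithmic.choice = "automatic") {
  algorithmic.choice <- match.arg(algorithmic.choice,
                                  c("automatic", "full enumeration", "MC3"))
  # An explicit request for MC3 is always honoured
  if (algorithmic.choice == "MC3") return("MC3")
  # Fewer than 20 explanatory variables: enumerate the whole model space
  if (p < 20) return("full enumeration")
  # p >= 20: fall back to MC3, warning if full enumeration was requested
  if (algorithmic.choice == "full enumeration") {
    warning("p >= 20: using MC3 instead of full enumeration")
  }
  "MC3"
}

choose_algorithm(15)          # "full enumeration"
choose_algorithm(25)          # "MC3"
choose_algorithm(5, "MC3")    # "MC3"
```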
The MC3 algorithm was first introduced by Madigan and York (1995) while its current implementation is described in the Appendix of Fouskakis and Ntzoufras (2022).
pep.lm returns an object of class pep, i.e., a list with the following elements:
models |
A matrix containing information about the models examined.
In particular, in row |
inc.probs |
A named vector with the posterior inclusion probabilities of the explanatory variables. |
x |
The input data matrix (of dimension |
y |
The response vector (of length |
fullmodel |
Formula, representing the full model. |
mapp |
For |
intrinsic |
Whether the prior on the model parameters was PEP or intrinsic. |
reference.prior |
Whether the baseline prior was the reference prior or the dependence Jeffreys prior. |
beta.binom |
Whether the prior on the model space was beta–binomial or uniform. |
When MC3 is run, there is the additional list element allvisitedmodsM, a matrix of dimension (itermc3-burnin) × (p+2) containing all ‘visited’ models (as variable inclusion indicators together with their corresponding marginal likelihood and R2) by the MC3 algorithm after the burnin period.
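As an illustration of working with a matrix of this shape, the sketch below builds a mock stand-in and summarises it. Everything here is an assumption made for the example: the column order (p inclusion indicators first, then marginal likelihood, then R2), the column names, and the values themselves. The visit frequencies computed at the end are simple descriptive summaries and need not coincide with how the package computes inc.probs.

```r
set.seed(42)
p      <- 3
n_kept <- 8  # stands in for itermc3 - burnin

# Mock inclusion indicators: one row per retained MC3 iteration
gamma <- matrix(rbinom(n_kept * p, 1, 0.5), nrow = n_kept)

# Mock matrix with p + 2 columns (assumed layout and made-up values)
visited <- cbind(gamma,
                 as.vector(-10 + gamma %*% c(2, 1, -1)),  # fake log marginal likelihood
                 0.5 * rowMeans(gamma))                   # fake R2
colnames(visited) <- c(paste0("x", 1:p), "logml", "R2")

# Label each visited model by its inclusion pattern, e.g. "101"
key <- apply(visited[, 1:p, drop = FALSE], 1, paste, collapse = "")

# Share of retained iterations spent in each distinct model
visit_freq <- sort(table(key), decreasing = TRUE) / n_kept

# Share of retained iterations in which each variable was included
incl_freq <- colMeans(visited[, 1:p, drop = FALSE])
```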
Fouskakis, D. and Ntzoufras, I. (2022) Power–Expected–Posterior Priors as Mixtures of g–Priors in Normal Linear Models. Bayesian Analysis, 17(4): 1073–1099. doi:10.1214/21-BA1288
Madigan, D. and York, J. (1995) Bayesian Graphical Models for Discrete Data. International Statistical Review, 63(2): 215–232. doi:10.2307/1403615
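# Load the example data set
data(UScrime_data)
# Default settings: PEP prior, reference baseline prior, beta-binomial prior on the model space
res <- pep.lm(y~.,data=UScrime_data)
# Uniform prior on the model space
resu <- pep.lm(y~.,data=UScrime_data,beta.binom=FALSE)
# Intrinsic prior on the model parameters
resi <- pep.lm(y~.,data=UScrime_data,intrinsic=TRUE)
# MC3 is stochastic, so fix the seed for reproducibility
set.seed(123)
# Force the MC3 algorithm with 2000 iterations (default burnin of 1000)
res2 <- pep.lm(y~.,data=UScrime_data,algorithmic.choice="MC3",itermc3=2000)
# MC3 with the dependence Jeffreys prior as baseline and a short burnin
resj2 <- pep.lm(y~.,data=UScrime_data,reference.prior=FALSE,
                algorithmic.choice="MC3",burnin=20,itermc3=1800)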