modelSelection: Bayesian variable selection for linear models via non-local priors


View source: R/modelSelection.R

Description

Bayesian model selection for linear, asymmetric linear, median and quantile regression under non-local or Zellner priors. p>>n can be handled.

modelSelection enumerates all models when feasible and uses a Gibbs scheme otherwise. See coef and coefByModel for estimates and posterior intervals of regression coefficients, and rnlp for posterior samples.

modelsearchBlockDiag seeks the highest posterior probability model using an iterative block search.

Usage

modelSelection(y, x, data, smoothterms, nknots=9,
  groups=1:ncol(x), constraints, center=TRUE, scale=TRUE,
  enumerate, includevars=rep(FALSE,ncol(x)),
  maxvars, niter=5000, thinning=1,
  burnin=round(niter/10), family='normal', priorCoef,
  priorGroup, priorDelta=modelbbprior(1,1),
  priorConstraints=priorDelta,
  priorVar=igprior(.01,.01),
  priorSkew=momprior(tau=0.348), phi, deltaini=rep(FALSE,ncol(x)),
  initSearch='greedy', method='auto', hess='asymp', optimMethod='CDA',
  B=10^5, XtXprecomp=ifelse(ncol(x)<10^4,TRUE,FALSE), verbose=TRUE)

modelsearchBlockDiag(y, x, priorCoef=momprior(tau=0.348),
  priorDelta=modelbbprior(1,1), priorVar=igprior(0.01,0.01),
  blocksize=10, maxiter=10, maxvars=100, maxlogmargdrop=20,
  maxenum=10, verbose=TRUE)

Arguments

y

Either a formula with the regression equation or a vector with observed responses. The response can be either continuous or of class Surv (survival outcome). If y is a formula then x, groups and constraints are created automatically.
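
For instance, a minimal sketch of the formula interface (the data frame df and its columns are hypothetical):

#Hypothetical data: response y and candidate covariates z1, z2, z3
df <- data.frame(y=rnorm(100), z1=rnorm(100), z2=rnorm(100), z3=rnorm(100))
#x, groups and constraints are built automatically from the formula
fit <- modelSelection(y ~ z1 + z2 + z3, data=df, priorCoef=momprior(tau=0.348),
  priorDelta=modelbbprior(1,1))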

x

Design matrix with linear covariates for which we want to assess if they have a linear effect on the response. Ignored if y is a formula

data

If y is a formula then data should be a data frame containing the variables in the model

smoothterms

Formula for non-linear covariates (cubic splines); modelSelection assesses whether each variable has no effect, a linear effect or a non-linear effect. smoothterms can also be a design matrix or data.frame containing linear terms; for each column modelSelection creates a spline basis and tests no/linear/non-linear effects
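
As a hedged sketch (reusing the hypothetical data frame df above), a possible call that tests a non-linear effect:

#Assess no/linear/non-linear effect of z3 via a cubic spline basis,
#while z1 and z2 enter as linear terms subject to selection
fit <- modelSelection(y ~ z1 + z2, smoothterms= ~ z3, data=df, nknots=9)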

nknots

Number of spline knots. For cubic splines the non-linear basis adds nknots-4 coefficients for each linear term, so we recommend setting nknots to a small/moderate value

groups

If variables in x should be added/dropped in groups, groups indicates the group each variable belongs to (by default each variable goes in a separate group)

constraints

Constraints on the model space. List with length equal to the number of groups; if constraints[[i]]=c(j,k) then group i can only be in the model if groups j and k are also in the model
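
A minimal sketch (the data are simulated here, and the encoding of unconstrained groups as integer(0) is an assumption based on the description above):

#Simulate a design matrix with 4 columns; columns 3-4 form one group
x4 <- matrix(rnorm(100*4), nrow=100)
y4 <- x4[,1] + rnorm(100)
groups <- c(1, 2, 3, 3)
#Group 3 may only enter the model if groups 1 and 2 are both included
#(integer(0) marks groups without constraints; this encoding is an assumption)
constraints <- list(integer(0), integer(0), c(1,2))
fit <- modelSelection(y=y4, x=x4, groups=groups, constraints=constraints)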

center

If TRUE, y and x are centered to have zero mean. Dummy variables corresponding to factors are NOT centered

scale

If TRUE, y and columns in x are scaled to have variance=1. Dummy variables corresponding to factors are NOT scaled

enumerate

Default is TRUE if there are fewer than 15 variable groups. If TRUE, all models with up to maxvars variables are enumerated; otherwise Gibbs sampling is used to explore the model space

includevars

Logical vector of length ncol(x) indicating variables that should always be included in the model, i.e. variable selection is not performed for these variables

maxvars

When enumerate==TRUE, only models with up to maxvars variables are enumerated (defaults to all variables). In modelsearchBlockDiag a sequence of models is defined from 1 up to maxvars variables

niter

Number of Gibbs sampling iterations

thinning

MCMC thinning factor, i.e. only one out of every thinning iterations is reported. Defaults to thinning=1, i.e. no thinning

burnin

Number of burn-in MCMC iterations. Defaults to .1*niter. Set to 0 for no burn-in

family

Residual distribution. Possible values are 'normal','twopiecenormal','laplace', 'twopiecelaplace', or 'auto'. For the latter the residual distribution is inferred from the data. 'laplace' corresponds to median regression and 'twopiecelaplace' to quantile regression. See argument priorSkew

priorCoef

Prior on coefficients, created by momprior, imomprior, emomprior or zellnerprior. Prior dispersion is on coefficients/sqrt(scale) for Normal and two-piece Normal, and on coefficients/sqrt(2*scale) for Laplace and two-piece Laplace.

priorGroup

Prior on grouped coefficients (e.g. categorical predictors with >2 categories, splines). Created by groupmomprior, groupemomprior, groupimomprior or groupzellnerprior

priorDelta

Prior on model space. Use modelbbprior() for Beta-Binomial prior, modelbinomprior(p) for Binomial prior with prior inclusion probability p, modelcomplexprior for Complexity prior, or modelunifprior() for Uniform prior

priorConstraints

Prior distribution on the number of terms subject to hierarchical constraints that are included in the model

priorVar

Inverse gamma prior on scale parameter. For Normal outcomes variance=scale, for Laplace outcomes variance=2*scale

priorSkew

Either a fixed value for tanh(alpha) where alpha is the asymmetry parameter or a prior on tanh(alpha). For family=='twopiecelaplace' setting alpha=a is equivalent to performing quantile regression for the quantile (1+a)/2. Ignored if family is 'normal' or 'laplace'.
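
A hedged sketch of median and quantile-type regression (y and x as in the Examples below; the mapping between the fixed value of priorSkew and the target quantile follows the description above):

#Median regression under Laplace residuals
fitmed <- modelSelection(y=y, x=x, family='laplace', priorCoef=momprior(tau=0.348))
#Asymmetric Laplace residuals with the asymmetry fixed at 0.5,
#i.e. regression targeting the (1+0.5)/2 = 0.75 quantile
fitq <- modelSelection(y=y, x=x, family='twopiecelaplace', priorSkew=0.5)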

phi

Residual variance. Typically this is unknown and therefore left missing. If specified, argument priorVar is ignored

deltaini

Logical vector of length ncol(x) indicating which coefficients should be initialized to be non-zero. Defaults to all variables being excluded from the model

initSearch

Algorithm to refine deltaini. initSearch=='greedy' uses a greedy Gibbs sampling search. initSearch=='SCAD' sets deltaini to the non-zero elements in a SCAD fit with cross-validated regularization parameter. initSearch=='none' leaves deltaini unmodified

method

Method to compute the marginal likelihood. The default is to use closed-form expressions whenever possible (only available for pMOM priors with up to 15 covariates) and Laplace approximations otherwise. method=='Laplace' for Laplace approximation, method=='plugin' for a BIC-type approximation, method=='MC' for Importance Sampling, method=='Hybrid' for Hybrid Laplace-IS (only available for the piMOM prior). See Details.

hess

Method to estimate the hessian in the Laplace approximation to the integrated likelihood under Laplace or asymmetric Laplace errors. When hess=='asymp' the asymptotic hessian is used; when hess=='asympDiagAdj' a diagonal adjustment is applied (see Rossell and Rubio for details).

optimMethod

Algorithm to maximize the objective function when method=='Laplace' or method=='MC'. optimMethod=='LMA' uses a modified Newton-Raphson algorithm, 'CDA' a coordinate descent algorithm

B

Number of samples to use in Importance Sampling scheme. Ignored if method=='Laplace'

XtXprecomp

Set to TRUE to pre-compute the Gram matrix x'x upfront (saves time), to FALSE to compute and store elements only as needed (saves memory)

verbose

Set verbose==TRUE to print iteration progress

blocksize

Maximum number of variables in a block. Note that the computational cost of the algorithm is of order 2^blocksize

maxiter

Maximum number of iterations, each iteration includes a screening pass to add and subtract variables

maxlogmargdrop

Stop the sequence of models when the drop in log p(y|model) is greater than maxlogmargdrop. This option avoids spending unnecessary time exploring overly large models

maxenum

If the posterior mode found has fewer than maxenum variables then a full enumeration of all its submodels is performed

Details

Let delta be the vector indicating inclusion/exclusion of each column of x in the model. The Gibbs algorithm sequentially samples from the posterior of each element in delta conditional on all the remaining elements in delta and the data. To do this it is necessary to evaluate the marginal likelihood for any given model. These have closed-form expression for the MOM prior, but for models with >15 variables these are expensive to compute and Laplace approximations are used instead (for the residual variance a log change of variables is used, which improves the approximation). For other priors closed forms are not available, so by default Laplace approximations are used. For the iMOM prior we also implement a Hybrid Laplace-IS which uses a Laplace approximation to evaluate the integral wrt beta and integrates wrt phi (residual variance) numerically.

It should be noted that Laplace approximations tend to under-estimate the marginal densities when the MLE of some parameter is very close to 0. That is, they tend to be conservative in the sense of excluding more variables from the model than an exact calculation would.

Finally, method=='plugin' provides a BIC-type approximation that is faster than exact or Laplace methods, at the expense of some accuracy. In non-sparse situations where models with many variables have large posterior probability method=='plugin' can be substantially faster.
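
For instance, a sketch selecting the approximation explicitly (y and x as in the Examples below):

#Force a Laplace approximation to the integrated likelihood
fitL <- modelSelection(y=y, x=x, method='Laplace')
#Faster BIC-type plugin approximation
fitP <- modelSelection(y=y, x=x, method='plugin')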

For more details on the methods used to compute marginal densities see Johnson & Rossell (2012).

modelsearchBlockDiag uses the block search method described in Papaspiliopoulos & Rossell. Briefly, spectral clustering is run on X'X to cluster variables into blocks of blocksize and subsequently the Coolblock algorithm is used to define a sequence of models of increasing size. The exact integrated likelihood is evaluated for all models in this path, the best model chosen, and the scheme iteratively repeated to add and drop variables until convergence.
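
A minimal sketch of the block search (y and x as in the Examples below):

#Block search for the highest posterior probability model
fitblock <- modelsearchBlockDiag(y=y, x=x, priorCoef=momprior(tau=0.348),
  priorDelta=modelbbprior(1,1), blocksize=10)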

Value

Object of class msfit, which extends a list with elements

postSample

Matrix with posterior samples of the model indicator. postSample[i,j]==1 indicates that variable j was included in the model at MCMC iteration i

postOther

postOther returns posterior samples for parameters other than the model indicator, i.e. basically hyper-parameters. If hyper-parameters were fixed in the model specification, postOther will be empty.

margpp

Marginal posterior probability for inclusion of each covariate. This is computed by averaging the marginal posterior inclusion probabilities across Gibbs iterations, which is much more accurate than simply taking colMeans(postSample). See the sketch below.
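
A sketch of the distinction, for a fit obtained via Gibbs sampling (fit denotes any msfit object with MCMC output):

#Inclusion probabilities averaged over Gibbs iterations (more accurate)
fit$margpp
#Raw frequency of inclusion across the posterior samples (noisier)
colMeans(fit$postSample)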

postMode

Model with highest posterior probability amongst all those visited

postModeProb

Unnormalized posterior prob of posterior mode (log scale)

postProb

Unnormalized posterior prob of each visited model (log scale)

priors

List with priors specified when calling modelSelection

Author(s)

David Rossell

References

Johnson V.E., Rossell D. Non-Local Prior Densities for Default Bayesian Hypothesis Tests. Journal of the Royal Statistical Society B, 2010, 72, 143-170.

Johnson V.E., Rossell D. Bayesian model selection in high-dimensional settings. Journal of the American Statistical Association, 2012, 107, 649-660.

Papaspiliopoulos O., Rossell, D. Scalable Bayesian variable selection and model averaging under block orthogonal design. 2016

Rossell D., Rubio F.J. Tractable Bayesian variable selection: beyond normality. 2016

See Also

msfit-class for details on the output. postProb to obtain posterior model probabilities. coef.msfit for Bayesian model averaging estimates and intervals. predict.msfit for BMA estimates and intervals for user-supplied covariate values. rnlp to obtain posterior samples for the coefficients. nlpMarginal to compute marginal densities for a given model.

Examples

#Simulate data
x <- matrix(rnorm(100*3),nrow=100,ncol=3)
theta <- matrix(c(1,1,0),ncol=1)
y <- x %*% theta + rnorm(100)

#Specify prior parameters
priorCoef <- momprior(tau=0.348)
priorDelta <- modelunifprior()

#Alternative model space prior: 0.5 prior prob for including any covariate
priorDelta <- modelbinomprior(p=0.5)

#Alternative: Beta-Binomial prior for model space
priorDelta <- modelbbprior(alpha.p=1,beta.p=1)

#Model selection
fit1 <- modelSelection(y=y, x=x, center=FALSE, scale=FALSE,
  priorCoef=priorCoef, priorDelta=priorDelta)
postProb(fit1) #posterior model probabilities

fit1$margpp #posterior marginal inclusion prob

coef(fit1) #BMA estimates, 95% intervals, marginal post prob

Example output

Loading required package: mvtnorm
Loading required package: ncvreg
Loading required package: mgcv
Loading required package: nlme
This is mgcv 1.8-28. For overview type 'help("mgcv-package")'.
Enumerating models...
Computing posterior probabilities........ Done.
  modelid family           pp
7     1,2 normal 9.862385e-01
8   1,2,3 normal 1.376150e-02
3       2 normal 2.453821e-13
4     2,3 normal 1.149752e-15
5       1 normal 3.291586e-21
6     1,3 normal 2.843992e-23
1         normal 1.062552e-24
2       3 normal 2.147270e-27
       x1        x2        x3 
1.0000000 1.0000000 0.0137615 
              estimate      2.5%    97.5%    margpp
intercept 0.000000e+00 0.0000000 0.000000 0.0000000
x1        9.424775e-01 0.7458553 1.141360 1.0000000
x2        1.233866e+00 1.0443558 1.425732 1.0000000
x3        5.486328e-05 0.0000000 0.000000 0.0137615
phi       1.094734e+00 0.8345910 1.433771 1.0000000
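
A further hedged sketch, not run in the output above: forcing Gibbs exploration of the model space rather than full enumeration.

#Gibbs model search instead of full enumeration
fit2 <- modelSelection(y=y, x=x, center=FALSE, scale=FALSE, enumerate=FALSE,
  niter=5000, priorCoef=priorCoef, priorDelta=priorDelta)
fit2$margpp   #marginal inclusion probabilities averaged over Gibbs iterations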
