Core function for the package. A call to modelSampler initiates a Gibbs
sampler for drawing values from the posterior of a rescaled spike and slab model.
Results from the Gibbs sampler are used to derive optimal AIC, BIC and highest
posterior models from a restricted model search. The core function can also be called
from its bootstrap wrapper, boot.modelSampler, which can be used to assess the
stability of AIC and BIC model selection and to provide a more stable set of variables.
formula |
A symbolic description of the model that is to be fit. |
data |
Data frame containing the predictors (variables) in the model. |
V.small |
Small hypervariance set to implement selective shrinkage. |
V.big |
Big hypervariance. If NULL, V.big is set to n, the sample size. |
n.iter1 |
Number of burn-in iterations. |
n.iter2 |
Number of iterations sampled after burn-in. |
fast |
Break up beta update into 'beta.blocks' chunks. Typically set to 'FALSE'. |
beta.blocks |
Size of beta updates (only used when fast=TRUE). |
complexity |
Model complexity parameter, which is estimated by the Gibbs sampler. |
verbose |
Print iterations and other user-friendly output. |
seed |
Set random generator seed. |
... |
Further arguments passed to or from other methods. |
The Bayesian rescaled spike and slab model is designed to induce a type of
regularization called selective shrinkage (for details, see the references).
Selective shrinkage results from the two-point prior used for the hypervariance
and from the choice of V.big, which by default is set to the sample size.
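The effect of a two-point hypervariance prior can be illustrated with a small base-R simulation. This is a simplified sketch, not the package's sampler: it assumes each coefficient's hypervariance equals V.small with probability 1 - w and V.big with probability w, where w stands in for the complexity parameter. The values of n, V.small, w and p below are arbitrary illustrative choices.

```r
# Simplified sketch of a two-point hypervariance prior (an assumption for
# illustration; the package's actual Gibbs sampler is more involved).
set.seed(1)

n       <- 100      # hypothetical sample size
V.small <- 0.005    # illustrative small hypervariance (spike)
V.big   <- n        # big hypervariance; the documented default is n
w       <- 0.2      # illustrative complexity: prior probability of the slab
p       <- 10000    # number of prior draws

# Draw each hypervariance from the two-point distribution {V.small, V.big}.
in.slab  <- rbinom(p, size = 1, prob = w) == 1
hypervar <- ifelse(in.slab, V.big, V.small)

# Draw coefficients from N(0, hypervariance).
beta <- rnorm(p, mean = 0, sd = sqrt(hypervar))

# Spike draws are concentrated near zero; slab draws are diffuse.
spike.sd <- sd(beta[!in.slab])
slab.sd  <- sd(beta[in.slab])
print(c(spike = spike.sd, slab = slab.sd))
```

Under these assumptions, coefficients assigned V.small are pulled tightly toward zero while those assigned V.big are left essentially unshrunk, which is the intuition behind selective shrinkage.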
An object of class modelSampler, which is a list with the following components:
formula |
The original formula used in calling |
modelTracker |
Total models visited after burn-in sampling. |
beta.all |
Sampled beta values after burn-in sampling. |
FPE |
Lists of variables selected by AIC and BIC. Also returns posterior inclusion probability of each variable. |
FPEstrat |
Returns top models stratified by size. Selection criterion is minimum residual sum of squares (RSS). |
FPEstart.pen |
Returns FPE values of the models stratified by model size. Also returns
frequencies of models visited by |
hpm |
Returns the posterior inclusion probability of each variable. |
mss |
Returns minimum RSS values of each model visited by |
aic |
Returns AIC values of each model visited by |
bic |
Returns BIC values of each model visited by |
coverage |
Returns a vector of the probabilities of visiting a new model at each iteration
visited by |
complexity |
Returns a vector of estimated complexity parameters at each iteration by |
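To make the relationship between the mss, aic and bic components more concrete, the sketch below computes RSS-based AIC and BIC values for a few nested linear models in base R. It uses the built-in mtcars data and the common Gaussian forms n*log(RSS/n) + 2k and n*log(RSS/n) + k*log(n). These formulas, the helper ic.from.rss and the candidate models are assumptions for illustration; the package may scale or define its criteria differently.

```r
# Hypothetical helper: AIC- and BIC-type criteria from a residual sum of squares.
# Assumes the Gaussian forms n*log(RSS/n) + penalty; not taken from the package.
ic.from.rss <- function(rss, n, k) {
  c(rss = rss,
    aic = n * log(rss / n) + 2 * k,
    bic = n * log(rss / n) + k * log(n))
}

n <- nrow(mtcars)

# Candidate models of increasing size (chosen arbitrarily for illustration).
candidates <- list(mpg ~ wt,
                   mpg ~ wt + hp,
                   mpg ~ wt + hp + qsec)

# One row per model: size, minimum RSS for that fit, and the two criteria.
ic.table <- t(sapply(candidates, function(f) {
  fit <- lm(f, data = mtcars)
  k   <- length(coef(fit)) - 1   # number of predictors, excluding the intercept
  c(size = k, ic.from.rss(sum(resid(fit)^2), n, k))
}))

print(ic.table)

# Model sizes minimizing each criterion.
best.aic <- ic.table[which.min(ic.table[, "aic"]), "size"]
best.bic <- ic.table[which.min(ic.table[, "bic"]), "size"]
```

Because log(n) exceeds 2 once n is larger than about 7, the BIC-type penalty is heavier than the AIC-type penalty here, so BIC tends to favor smaller models.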
Tanujit Dey tanujit.dey@gmail.com
Ishwaran, H. and Rao, J. S. (2003). Detecting differentially expressed genes in microarrays using Bayesian model selection. J. Amer. Stat. Assoc., 98, 438–455.
Ishwaran, H. and Rao, J. S. (2005). Spike and slab gene selection for multigroup microarray data. J. Amer. Stat. Assoc., 100, 764–780.
Ishwaran, H. and Rao, J. S. (2005). Spike and slab variable selection: frequentist and Bayesian strategies. Ann. Statist., 33, 730–773.
Dey, T. (2013). modelSampler: An R Tool for Variable Selection and Model Exploration in Linear Regression. Journal of Data Science, 11(2), 371–387.
boot.modelSampler, print.boot.modelSampler, print.modelSampler, plot.modelSampler, plot.icicle, plot.FPE, plot.var.stability, plot.ooberror.
# Example 1:
library(modelSampler)
data(Pollute, package = "modelSampler")
ms.out <- modelSampler(MortRate~., Pollute, n.iter1=2500,
                       n.iter2=2500, verbose=TRUE)
# Print several outputs from modelSampler.
print(ms.out)
# Returns a collection of graphics which includes a complexity plot;
# a penalization plot which depicts model size specific estimated
# minimum residual sum of squares, AIC, BIC values; a dimensionality plot
# of several model sizes visited by modelSampler;
# an image plot to visualize variable importance
# and a coverage plot depicting the probability of the Gibbs sampler visiting a new model.
# For details of each plot, see plot.modelSampler.
plot.modelSampler(ms.out)
# Based on preliminary analysis, an out-of-bag technique is used
# to estimate prediction error (PE). Based on the estimated PE,
# the best model of size "k" is selected.
ms.boot <- boot.modelSampler(MortRate~., Pollute, n.iter1=2500,
                             n.iter2=2500, B=20, verbose = TRUE)
# Prints selected subset of variables, based on estimated prediction error.
print(ms.boot)
# This plot will give an idea about instability of FPE model selection criteria.
plot.FPE(ms.boot)
# This plot will depict the model space.
plot.icicle(ms.boot, main="The Pollute data")
# Graphical visualization for selecting "the" best model based on estimated
# prediction error of hard shrunk predictors.
plot.ooberror(ms.boot, main="The Pollute data")