spikeSlabGAM | R Documentation |
This function fits structured additive regression models with spike-and-slab
priors via MCMC. The spike-and-slab prior results in an SSVS-like term
selection that can be used to aid model choice, e.g. to decide between linear
and smooth modelling of numeric covariates or which interaction effects are
relevant. Model terms can be factors (fct
), linear/polynomial
terms (lin
), uni- or bivariate splines (sm
,
srf
), random intercepts (rnd
) or Markov random
fields (mrf
) and their interactions, i.e. an interaction
between two smooth terms yields an effect surface, an interaction between a
linear term and a random intercept yields random slopes, an interaction
between a linear term and a smooth term yields a varying coefficient term
etc.
spikeSlabGAM( formula, data, ..., family = "gaussian", hyperparameters = list(), model = list(), mcmc = list(), start = list() )
formula |
the model formula (see details below and
|
data |
the data containing the variables in the function |
... |
additional arguments for |
family |
(character) the family of the response, defaults to
normal/Gaussian response, |
hyperparameters |
A list of arguments specifying the prior settings. See
|
model |
A list of arguments describing the model structure. See
|
mcmc |
A list of arguments specifying MCMC sampler options. See
|
start |
A list of starting values for the MCMC sampler. See
|
Implemented types of terms:
u
unpenalized model terms associated with a flat prior (no selection)
lin
linear/polynomial terms
fct
factors
sm
univariate smooth functions
rnd
random intercepts
mrf
Markov random fields
srf
surface estimation
Terms in the formula that are
not in the list of specials (i.e. the list of term types above) are
automatically assigned an appropriate special wrapper, i.e. numerical
covariates x
are treated as lin(x) + sm(x)
,
factors f
(and numerical covariates with very few distinct values, see
ssGAMDesign
) are treated as fct(f)
. Valid model
formulas have to satisfy the following conditions:
All
variables that are involved in interactions have to be present as main
effects as well, i.e. y ~ x1 + x1:x2
will produce an error because
the main effect of x2
is missing.
Interactions between
unpenalized terms not under selection (i.e. terms of type u
)
and penalized terms are not allowed, i.e. y ~ u(x1)* x2
will produce an
error.
If main effects are specified as special terms, the interactions
involving them have to be given as special terms as well, i.e. y ~
lin(x1) + lin(x2) + x1:x2
will produce an error.
The default prior settings for Gaussian data work best for a response with unit variance. If your data is scaled very differently, either rescale the response (recommended) or adjust the hyperparameters accordingly.
Sampling of the chains is done in parallel using package multicore
if
available. If not, a socket cluster set up with snow
is used where
available. Use options(mc.cores = foo)
to set the (maximal) number of
processes spawned by the parallelization. If options()$mc.cores
is
unspecified, snow uses 8.
an object of class spikeSlabGAM
with methods
summary.spikeSlabGAM
, predict.spikeSlabGAM
, and
plot.spikeSlabGAM
.
Fabian Scheipl
Fabian Scheipl (2011). spikeSlabGAM
: Bayesian Variable
Selection, Model Choice and Regularization for Generalized Additive Mixed
Models in R. Journal of Statistical Software, 43(14), 1–24.
Fabian Scheipl, Ludwig Fahrmeir, Thomas Kneib (2012). Spike-and-Slab Priors for Function Selection in Structured Additive Regression Models. Journal of the American Statistical Association, 107(500), 1518–1532.
ssGAMDesign
for details on model specification,
spikeAndSlab
for more details on the MCMC sampler and prior
specification, and ssGAM2Bugs
for MCMC diagnostics. Check out
the vignette for theoretical background and code examples.
## Not run: ## examples not run due to time constraints on CRAN checks. ## full examples below should take about 2-4 minutes. set.seed(91179) n <- 400 d <- data.frame(f1 = sample(gl(3, n/3)), f2 = sample(gl(4, n/4)), x1 = runif(n), x2 = runif(n), x3 = runif(n)) # true model: # - interaction between f1 and x1 # - smooth interaction between x1 and x2, # - x3 and f2 are noise variables without influence on y nf1 <- as.numeric(d$f1) d$f <- with(d, 5 * (nf1 + 2 * sin(4 * pi * (x1 - 0.2) * (x2 - 0.7)) - nf1 * x1)) d$y <- with(d, scale(f + rnorm(n))) d$yp <- with(d, rpois(n, exp(f/10))) # fit & display the model: m1 <- spikeSlabGAM(y ~ x1 * f1 + f1 * f2 + x3 * f1 + x1 * x2, data = d) summary(m1) # visualize estimates: plot(m1) plot(m1, cumulative = FALSE) (plotTerm("fct(f1):fct(f2)", m1, aggregate = median)) print(p <- plotTerm("sm(x1):sm(x2)", m1, quantiles = c(0.25, 0.75), cumulative = FALSE)) # change MCMC settings and priors: mcmc <- list(nChains = 3, burnin = 100, chainLength = 1000, thin = 3, reduceRet = TRUE) hyper <- list(gamma = c(v0 = 5e-04), tau2 = c(10, 30), w = c(2, 1)) # complicated formula example, poisson response: m2 <- spikeSlabGAM(yp ~ x1 * (x2 + f1) + (x2 + x3 + f2)^2 - sm(x2):sm(x3), data = d, family = "poisson", mcmc = mcmc, hyperparameters = hyper) summary(m2) plot(m2) # quick&dirty convergence diagnostics: print(b <- ssGAM2Bugs(m2)) plot(b) ## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.