This function applies a cross-validation (CV) procedure for training Bayesian
models with hierarchical shrinkage priors using the hsstan package.
The function allows the option of embedded filtering of predictors for
feature selection within the CV loop. Within each training fold, an optional
filtering of predictors is performed, followed by fitting of an hsstan
model. Predictions on the testing folds are brought back together and error
estimation/accuracy is determined. The default is 10-fold CV.
The function is implemented within the nestedcv package. The hsstan
models do not require tuning of meta-parameters, so only a single
CV procedure is needed to evaluate performance. This is implemented using the
outer CV procedure of the nestedcv package, via outercv().
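A minimal sketch of this single-loop CV workflow, using simulated data. It assumes the nestedcv and hsstan packages are installed (the call is guarded so the script still runs without them); the small fold count and sampler settings are for illustration only, not recommended defaults.

```r
# Simulate a small dataset: 4 standardised predictors and a continuous
# outcome driven by marker1 and marker2.
set.seed(42)
n <- 100
x <- matrix(rnorm(n * 4), n, 4,
            dimnames = list(NULL, paste0("marker", 1:4)))
y <- 0.5 * x[, "marker1"] + 2 * x[, "marker2"] + rnorm(n)

# Single outer CV loop with a Bayesian hierarchical shrinkage model.
# marker1 is kept unpenalized; the rest are drawn from the shrinkage prior.
if (requireNamespace("nestedcv", quietly = TRUE) &&
    requireNamespace("hsstan", quietly = TRUE)) {
  fit <- nestedcv::outercv(y = y, x = x,
                           model = nestedcv::model.hsstan,
                           unpenalized = "marker1",
                           n_outer_folds = 3,
                           chains = 2, warmup = 100, iter = 200)
  print(fit$summary)  # performance pooled across the testing folds
}
```

Because hsstan models have no meta-parameters to tune, outercv() alone gives an unbiased performance estimate; no inner CV loop is needed.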
Usage:

model.hsstan(y, x, unpenalized = NULL, ...)
Arguments:

y: Response vector. For classification this should be a factor.
x: Matrix of predictors.
unpenalized: Vector of column names of x that are always retained in the model, i.e. not subject to the hierarchical shrinkage prior.
...: Optional arguments passed to hsstan().

Value:

An object of class hsstan.
Examples:

# Cross-validation is used to apply univariate filtering of predictors.
# Only one CV split is needed (outercv) as the Bayesian model does not
# require learning of meta-parameters.

# load iris dataset and simulate a continuous outcome
data(iris)
dt <- iris[, 1:4]
colnames(dt) <- c("marker1", "marker2", "marker3", "marker4")
dt <- as.data.frame(apply(dt, 2, scale))
dt$outcome.cont <- -3 + 0.5 * dt$marker1 + 2 * dt$marker2 +
  rnorm(nrow(dt), 0, 2)

# unpenalised covariates: always retained in the prediction model
uvars <- "marker1"
# penalised covariates: coefficients are drawn from hierarchical shrinkage
# prior
pvars <- c("marker2", "marker3", "marker4")

# run cross-validation with univariate filter and hsstan
# dummy sampling for fast execution of example
# recommend 4 chains, warmup 1000, iter 2000 in practice
oldopt <- options(mc.cores = 2)
res.cv.hsstan <- outercv(y = dt$outcome.cont,
                         x = dt[, c(uvars, pvars)],
                         model = model.hsstan,
                         filterFUN = lm_filter,
                         filter_options = list(force_vars = uvars,
                                               nfilter = 2,
                                               p_cutoff = NULL,
                                               rsq_cutoff = 0.9),
                         n_outer_folds = 3,
                         chains = 2,
                         unpenalized = uvars,
                         warmup = 100,
                         iter = 200)

# view prediction performance based on testing folds
res.cv.hsstan$summary
# view coefficients for the final model
res.cv.hsstan$final_fit
# view covariates selected by the univariate filter
res.cv.hsstan$final_vars

# load hsstan package to examine the Bayesian model
library(hsstan)
sampler.stats(res.cv.hsstan$final_fit)
print(projsel(res.cv.hsstan$final_fit), digits = 4)  # adding marker2
options(oldopt)

# Here adding `marker2` improves the model fit: substantial decrease of
# KL-divergence from the full model to the submodel. Adding `marker3` does
# not improve the model fit: no decrease of KL-divergence from the full
# model to the submodel.