cvpre: Full k-fold cross validation of a prediction rule ensemble...
In pre: Prediction Rule Ensembles

cvpre

R Documentation

Full k-fold cross validation of a prediction rule ensemble (pre)

Description

cvpre performs k-fold cross validation on the dataset used to create the specified prediction rule ensemble, providing an estimate of predictive accuracy on future observations.

Usage

cvpre(
  object,
  k = 10,
  penalty.par.val = "lambda.1se",
  pclass = 0.5,
  foldids = NULL,
  verbose = FALSE,
  parallel = FALSE,
  print = TRUE,
  ...
)

Arguments

`object`	An object of class `pre`.
`k`	integer. The number of cross validation folds to be used.
`penalty.par.val`	character or numeric. Value of the penalty parameter `\lambda` to be employed for selecting the final ensemble. The default `"lambda.1se"` employs the largest `\lambda` value whose cross-validated error is within 1 standard error of the minimum. Alternatively, `"lambda.min"` may be specified, to employ the `\lambda` value with minimum cross-validated error, or a numeric value `>0` may be specified, with higher values yielding a sparser ensemble. To evaluate the trade-off between accuracy and sparsity of the final ensemble, inspect `pre_object$glmnet.fit` and `plot(pre_object$glmnet.fit)`.
`pclass`	numeric. Only used for binary classification. Cut-off value for the predicted probabilities that should be used to classify observations to the second class.
`foldids`	numeric vector of length `nrow(object$data)` (the number of observations in the training data used to fit the original ensemble), giving the fold each observation is assigned to. Defaults to `NULL`, resulting in the original training observations being randomly assigned to one of the `k` folds. Depending on sample size, the number of factors in the data, the number of factor levels and their distributions, the default may yield errors. See 'Details'.
`verbose`	logical. Should progress of the cross validation be printed to the command line?
`parallel`	logical. Should parallel foreach be used? A parallel backend must be registered beforehand, for example with doMC.
`print`	logical. Should accuracy estimates be printed to the command line?
`...`	Further arguments to be passed to `predict.pre`.
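
To make the role of `pclass` concrete, the following minimal sketch applies a cut-off to predicted probabilities by hand. The probabilities and class labels are made up for illustration, and whether `pre` treats a probability exactly equal to the cut-off as the first or second class is an assumption here, not something the documentation states.

```r
# Made-up predicted probabilities of the second class (illustrative only)
p_second <- c(0.10, 0.45, 0.50, 0.80)
class_levels <- c("no", "yes")  # hypothetical labels; "yes" is the second class
pclass <- 0.5

# Assumption: probabilities strictly above the cut-off go to the second class
predicted <- ifelse(p_second > pclass, class_levels[2], class_levels[1])
predicted
```

Raising `pclass` makes assignment to the second class more conservative, which in turn changes the misclassification rate computed from the cross-validated predictions.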

Details

The random sampling employed by default may yield a test fold that contains all observations with a given level of a given factor. This results in an error, because predictions would have to be computed for a factor level that was not observed in the corresponding training data, which is impossible. By manually specifying the foldids argument, users can make sure all factor levels are represented in each of the k training partitions.
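
One way to build such a foldids vector is to assign folds separately within each level of the factor of concern. The sketch below is only an illustration: `make_stratified_folds` is a hypothetical helper, not part of pre, and it handles a single factor. It guarantees that a level stays in every training partition only if that level has at least two observations.

```r
# Hypothetical helper (not part of pre): assign fold ids within each factor level
make_stratified_folds <- function(f, k = 10) {
  f <- as.factor(f)
  foldids <- integer(length(f))
  for (lev in levels(f)) {
    idx <- which(f == lev)
    # Spread this level's observations over folds 1..k, in random order
    foldids[idx] <- sample(rep_len(seq_len(k), length(idx)))
  }
  foldids
}

set.seed(42)
airq <- airquality[complete.cases(airquality), ]
airq$Month <- factor(airq$Month)  # treat Month as a factor for illustration
foldids <- make_stratified_folds(airq$Month, k = 10)

# Check that every Month level occurs in each of the 10 training partitions
all(sapply(seq_len(10), function(j) all(table(airq$Month[foldids != j]) > 0)))

# Pass the fold ids to cvpre (only runs if pre is installed)
if (requireNamespace("pre", quietly = TRUE)) {
  airq.ens <- pre::pre(Ozone ~ ., data = airq)
  airq.cv <- pre::cvpre(airq.ens, k = 10, foldids = foldids)
}
```

With several factors in the data, the same idea could be applied to their interaction, provided each combination of levels has enough observations.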

Value

Calculates cross-validated estimates of predictive accuracy and prints these to the command line. For survival regression, accuracy is not calculated, as there is currently no agreed-upon way to best quantify accuracy in survival regression models. Users can compute their own accuracy estimates using the (invisibly returned) cross-validated predictions ($cvpreds).

Invisibly, a list of three objects is returned: accuracy (containing accuracy estimates), cvpreds (containing cross-validated predictions) and fold_indicators (a vector indicating the cross validation fold each observation was part of).

For (multivariate) continuous outcomes, accuracy is a list with elements $MSE (mean squared error on test observations) and $MAE (mean absolute error on test observations).

For (binary and multiclass) classification, accuracy is a list with elements $SEL (mean squared error on predicted probabilities), $AEL (mean absolute error on predicted probabilities), $MCR (average misclassification error rate) and $table (proportion table with (mis)classification rates).
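
As an illustration of computing custom accuracy estimates from the returned predictions, the sketch below breaks squared and absolute error down by fold. `fold_accuracy` is a hypothetical helper, not part of pre, and the toy numbers are made up. The sketch assumes a single continuous outcome, with cvpreds and fold_indicators aligned row by row with the training data.

```r
# Hypothetical helper (not part of pre): per-fold error from cross-validated predictions
fold_accuracy <- function(y, cvpreds, fold_indicators) {
  err <- y - cvpreds
  data.frame(
    fold = sort(unique(fold_indicators)),
    MSE = as.numeric(tapply(err^2, fold_indicators, mean)),
    MAE = as.numeric(tapply(abs(err), fold_indicators, mean))
  )
}

# Toy check with made-up values: two folds of two observations each
toy <- fold_accuracy(y = c(1, 2, 3, 4),
                     cvpreds = c(1.5, 2, 2, 4),
                     fold_indicators = c(1, 1, 2, 2))
toy

# Applied to actual cvpre output (only runs if pre is installed)
if (requireNamespace("pre", quietly = TRUE)) {
  set.seed(42)
  airq <- airquality[complete.cases(airquality), ]
  airq.ens <- pre::pre(Ozone ~ ., data = airq)
  airq.cv <- pre::cvpre(airq.ens, print = FALSE)
  # as.numeric() guards against cvpreds being returned as a one-column matrix
  fold_accuracy(airq$Ozone, as.numeric(airq.cv$cvpreds), airq.cv$fold_indicators)
}
```

A per-fold breakdown like this can show whether the overall estimate is driven by one or two poorly predicted folds.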

Examples

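library("pre")
set.seed(42)
# Fit the ensemble on complete cases only
airq.ens <- pre(Ozone ~ ., data = airquality[complete.cases(airquality),])
# 10-fold cross validation; prints accuracy estimates, returns results invisibly
airq.cv <- cvpre(airq.ens)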

pre documentation built on May 29, 2024, 5:10 a.m.
