comp_pred: Fit and predict competing classification algorithms
In FFTrees: Generate, Visualise, and Evaluate Fast-and-Frugal Decision Trees

Description

comp_pred provides a wrapper for running alternative classification algorithms, i.e., fitting them to training data (data.train) or using them to predict test data (data.test).

Usage

comp_pred(
  formula,
  data.train,
  data.test = NULL,
  algorithm = NULL,
  model = NULL,
  sens.w = NULL,
  new.factors = "exclude",
  quiet_mis = FALSE
)

Arguments

`formula`	A formula (usually `x$formula`, for an `FFTrees` object `x`).
`data.train`	A training dataset (as a data frame).
`data.test`	A testing dataset (as a data frame).
`algorithm`	A character string specifying an algorithm in the set: `"lr"`: Logistic regression (using `glm` from stats with `family = "binomial"`); `"rlr"`: Regularized logistic regression (currently not supported); `"cart"`: Decision trees (using `rpart` from rpart); `"svm"`: Support vector machines (using `svm` from e1071); `"rf"`: Random forests (using `randomForest` from randomForest).
`model`	An optional existing model (as a `model`), to be applied to the test data.
`sens.w`	Sensitivity weight parameter (numeric, from `0` to `1`), required to compute `wacc`.
`new.factors`	A character string specifying what should be done if the test set contains factor values that did not occur in the training set. Available options: `"exclude"`: exclude the affected cases (i.e., remove them, used by default); `"base"`: predict the base rate of the criterion.
`quiet_mis`	A logical value passed to hide/show `NA` user feedback (usually `x$params$quiet$mis` of the calling function). Default: `quiet_mis = FALSE` (i.e., show user feedback).
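
The two `new.factors` options can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy data, the variable names (`train`, `test`, `is_new`, `pred_base`), and the reading of "base rate" as the mean of the training criterion are all assumptions made for illustration.

```r
# Hypothetical toy data: the test set contains a cue value ("c")
# that never occurs in the training set.
train <- data.frame(
  crit = c(TRUE, FALSE, TRUE, TRUE),
  cue  = factor(c("a", "b", "a", "b"))
)
test <- data.frame(
  crit = c(TRUE, FALSE, FALSE),
  cue  = c("a", "c", "b"),
  stringsAsFactors = FALSE
)

# Flag test cases whose cue value was not seen during training:
is_new <- !(test$cue %in% levels(train$cue))

# Analogue of new.factors = "exclude": drop the affected cases.
test_exclude <- test[!is_new, , drop = FALSE]

# Analogue of new.factors = "base": keep all cases, but assign the
# training base rate of the criterion to the affected ones
# (assumption: base rate = proportion of TRUE criterion values).
base_rate <- mean(train$crit)
pred_base <- ifelse(is_new, base_rate, NA_real_)  # NA: left to the fitted model
```

In this sketch, only the second test case is flagged, so the "exclude" variant keeps two rows and the "base" variant assigns that case the training base rate.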

Details

The range of competing algorithms currently available includes logistic regression (stats::glm), CART (rpart::rpart), support vector machines (e1071::svm), and random forests (randomForest::randomForest).
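
As a rough illustration of the fit-then-predict pattern for the `"lr"` case, the following base R sketch fits a logistic regression with stats::glm on training data and evaluates its predictions on test data. It does not call comp_pred itself. The built-in mtcars data, the odd/even row split, the 0.5 decision threshold, and the definition of weighted accuracy as `sens_w * sens + (1 - sens_w) * spec` are assumptions made for this example.

```r
# Illustrative data: built-in mtcars, with `am` (0/1) as the binary criterion.
# Deterministic split: odd rows for training, even rows for testing.
idx_train  <- seq(1, nrow(mtcars), by = 2)
data_train <- mtcars[idx_train, ]
data_test  <- mtcars[-idx_train, ]

# Fit: logistic regression via stats::glm with family = "binomial".
fit <- glm(am ~ wt, data = data_train, family = "binomial")

# Predict: probabilities for the test data, turned into binary decisions
# (the 0.5 threshold is an assumption of this sketch).
p_test    <- predict(fit, newdata = data_test, type = "response")
pred_test <- p_test >= 0.5
truth     <- data_test$am == 1

# Evaluate: sensitivity, specificity, and weighted accuracy.
sens <- sum(pred_test & truth) / sum(truth)
spec <- sum(!pred_test & !truth) / sum(!truth)

sens_w <- 0.5  # plays the role of the sens.w argument
wacc   <- sens_w * sens + (1 - sens_w) * spec
```

With `sens_w = 0.5`, sensitivity and specificity are weighted equally. Values closer to 1 put more weight on sensitivity.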

Support for handling missing data (or NA values) is currently only rudimentary. When enabled (via the global options allow_NA_pred or allow_NA_crit), all incomplete rows in data.train or data.test are removed (by using na.omit from stats) before a model is fitted or used for prediction. See the documentation of each model for more sophisticated ways of handling missing data.
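
The effect of this row-wise removal can be seen in a small base R sketch. The toy data frame and the variable names are hypothetical; only stats::na.omit is taken from the description above.

```r
# Hypothetical training data with NA values in the criterion and in two cues.
data_train <- data.frame(
  crit = c(TRUE, FALSE, NA, TRUE, FALSE),
  cue1 = c(1.2, NA, 3.1, 0.7, 2.4),
  cue2 = factor(c("a", "b", "a", NA, "b"))
)

n_before <- nrow(data_train)

# Keep only complete cases: any row with at least one NA is dropped.
data_complete <- stats::na.omit(data_train)

n_removed <- n_before - nrow(data_complete)
message(n_removed, " of ", n_before, " rows removed due to NA values.")
```

Here, three of the five rows contain at least one NA, so only two rows remain. This shows how quickly complete-case removal can shrink a dataset when missing values are spread across variables.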