Description
Perform probability estimation using jittering with over or undersampling.
Usage

jous(X, y, class_func, pred_func, type = c("under", "over"), delta = 10,
     nu = 1, X_pred = NULL, keep_models = FALSE, verbose = FALSE,
     parallel = FALSE, packages = NULL)
Arguments

X
    A matrix of continuous predictors.
y
    A vector of responses with entries in c(-1, 1).
class_func
    Function to perform classification. This function definition must be
    exactly of the form class_func(X, y), where X is a feature matrix and y is
    a vector of responses in c(-1, 1), and it must return a fitted object that
    pred_func can use to make predictions.
pred_func
    Function to create predictions. This function definition must be
    exactly of the form pred_func(fit_obj, X), where fit_obj is an object
    returned by class_func, and it must return a vector of class predictions
    in c(-1, 1).
type
    Type of sampling: "over" for oversampling, or "under" for undersampling.
delta
    An integer (greater than 3) to control the number of quantiles to
    estimate: the target quantiles are 1/delta, 2/delta, ..., (delta - 1)/delta.
nu
    The amount of noise to apply to predictors when oversampling data.
    The noise level is controlled by nu * sd(X[, j]) for each predictor
    column j; the default nu = 1 should work for most data sets.
X_pred
    A matrix of predictors for which to form probability estimates.
keep_models
    Whether to store all of the models used to create
    the probability estimates. If type = "over", setting this to TRUE may
    require a large amount of memory.
verbose
    If TRUE, print the function's progress to the terminal.
parallel
    If TRUE, fit the models in parallel using foreach. A parallel backend
    (for example, one registered via doParallel) must be set up beforehand.
packages
    If parallel = TRUE, a character vector naming the packages that must be
    loaded on each worker (passed to foreach).
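The class_func / pred_func contract above can be illustrated with a minimal, hypothetical pair (a toy threshold classifier invented for illustration, not part of JOUSBoost):

```r
# A hypothetical class_func/pred_func pair matching the required signatures.
# Any classifier works, as long as class_func(X, y) returns a fitted object
# and pred_func(fit_obj, X) returns labels in c(-1, 1).
class_func <- function(X, y) {
  # "fit": threshold the first predictor at its mean (toy model)
  list(cut = mean(X[, 1]))
}

pred_func <- function(fit_obj, X) {
  ifelse(X[, 1] > fit_obj$cut, 1, -1)
}

set.seed(1)
X <- matrix(rnorm(40), ncol = 2)
y <- ifelse(X[, 1] > 0, 1, -1)
fit <- class_func(X, y)
preds <- pred_func(fit, X)
```

Any real classifier (rpart, ksvm, adaboost, ...) can be wrapped the same way, as the Examples below show.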
Value

Returns a list containing information about the parameters used in the jous
function call, as well as the following additional components:
q
    The vector of target quantiles estimated by jous. Note that the estimated
    probabilities will be located at the midpoints of the values in q.
phat_train
    The in-sample probability estimates p(y = 1 | x).
phat_test
    Probability estimates for the optional test data in X_pred.
models
    If keep_models = TRUE, a list of the fitted models used to create the
    probability estimates.
confusion_matrix
    A confusion matrix for the in-sample fits.
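As a sketch of the quantile grid implied by delta (assuming equally spaced quantiles at multiples of 1/delta, consistent with the delta description above):

```r
delta <- 10
q <- (1:(delta - 1)) / delta      # target quantiles: 0.1, 0.2, ..., 0.9
# estimated probabilities fall at the midpoints of the cells defined by q
mids <- (c(0, q) + c(q, 1)) / 2   # 0.05, 0.15, ..., 0.95
```

So with the default delta = 10, the returned phat values are drawn from a grid of ten midpoints spaced 1/delta apart.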
Note

The jous function runs the classifier class_func a total of delta times on
the data, which can be computationally expensive. Also, jous cannot yet be
applied to categorical predictors: in the oversampling case, it is not clear
how to "jitter" a categorical variable.
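For intuition about the "jittering" the Note refers to, here is a hedged sketch of adding noise scaled by nu * sd(X[, j]) to rows duplicated during oversampling. The uniform noise distribution is an assumption made for illustration; the package's internal noise scheme may differ.

```r
set.seed(1)
nu <- 1
X <- matrix(rnorm(40), ncol = 2)
# rows duplicated by oversampling
dup <- X[sample(nrow(X), 10, replace = TRUE), , drop = FALSE]
# per-column noise scale nu * sd(X[, j])
scales <- nu * apply(X, 2, sd)
# jitter: add column-scaled noise to the duplicated rows
noise <- sweep(matrix(runif(length(dup), -1, 1), nrow = nrow(dup)), 2, scales, `*`)
X_jit <- dup + noise
```

This also makes clear why categorical predictors are a problem: there is no analogous small perturbation for a factor level.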
References

Mease, D., Wyner, A. and Buja, A. (2007). Cost-weighted boosting with
jittering and over/under-sampling: JOUS-boost. Journal of Machine Learning
Research, 8, 409-439.
Examples

## Not run:
# Generate data from Friedman model #
set.seed(111)
dat = friedman_data(n = 500, gamma = 0.5)
train_index = sample(1:500, 400)
# Apply jous to adaboost classifier
class_func = function(X, y) adaboost(X, y, tree_depth = 2, n_rounds = 200)
pred_func = function(fit_obj, X_test) predict(fit_obj, X_test)
jous_fit = jous(dat$X[train_index,], dat$y[train_index], class_func,
pred_func, keep_models = TRUE)
# get probability
phat_jous = predict(jous_fit, dat$X[-train_index, ], type = "prob")
# compare with probability from AdaBoost
ada = adaboost(dat$X[train_index,], dat$y[train_index], tree_depth = 2,
n_rounds = 200)
phat_ada = predict(ada, dat$X[-train_index,], type = "prob")
mean((phat_jous - dat$p[-train_index])^2)
mean((phat_ada - dat$p[-train_index])^2)
## Example using parallel option
library(doParallel)
cl <- makeCluster(4)
registerDoParallel(cl)
# n.b. packages = 'rpart' is not strictly needed here, since JOUSBoost
# exports it automatically; it is included for illustration
jous_fit = jous(dat$X[train_index,], dat$y[train_index], class_func,
pred_func, keep_models = TRUE, parallel = TRUE,
packages = 'rpart')
phat = predict(jous_fit, dat$X[-train_index,], type = 'prob')
stopCluster(cl)
## Example using SVM
library(kernlab)
class_func = function(X, y) ksvm(X, as.factor(y), kernel = 'rbfdot')
pred_func = function(obj, X) as.numeric(as.character(predict(obj, X)))
jous_obj = jous(dat$X[train_index,], dat$y[train_index], class_func = class_func,
pred_func = pred_func, keep_models = TRUE)
jous_pred = predict(jous_obj, dat$X[-train_index,], type = 'prob')
## End(Not run)