# R/eztune.R

#' Supervised Learning Function
#'
#' \code{eztune} is a function that automatically tunes adaboost, support
#' vector machines, gradient boosting machines, and elastic net. An
#' optimization algorithm is used to find a good set of tuning parameters
#' for the selected model. The function optimizes on a validation dataset,
#' cross-validated accuracy, or resubstitution accuracy.
#' @param x Matrix or data frame containing the independent variables.
#' @param y Vector of responses. Can either be a factor or a numeric vector.
#' @param method Model to be fit. Choices are "\code{ada}" for adaboost,
#' "\code{en}" for elastic net, "\code{gbm}" for gradient boosting machines,
#' and "\code{svm}" for support
#' vector machines.
#' @param optimizer Optimization method. Options are "\code{ga}" for a genetic
#'  algorithm and "\code{hjn}" for a Hooke-Jeeves optimizer.
#' @param fast Indicates if the function should use a subset of the
#'  observations when optimizing to speed up calculation time. A value
#'  of \code{TRUE} will use the smaller of 50\% of the data or 200 observations
#'  for model fitting, a number between \code{0} and \code{1} specifies the
#'  proportion of data to be used to fit the model, and a positive integer
#'  specifies the number of observations to be used to fit the
#'  model. A model is computed using a random selection of the data and
#'  the remaining data are used to validate model performance. The
#'  validation error measure is used as the optimization criterion (the
#'  resulting training-set sizes are sketched in comments after the
#'  function definition below).
#' @param loss The type of loss function used for optimization. Options
#' for models with a binary response are "\code{class}" for classification
#' error and "\code{auc}" for area under the curve. Options for models with a
#' continuous response are "\code{mse}" for mean squared error and
#' "\code{mae}" for mean absolute error. If the option "default" is selected,
#' or no loss is specified, the classification accuracy will be used for a binary
#' response model and the MSE will be use for models with a continuous
#' model.
#' @param cross If an integer k > 1 is specified, k-fold cross-validation
#'  is used to fit the model. This method is very slow for large datasets.
#'  This parameter is ignored unless \code{fast = FALSE}.
#' @return Function returns an object of class "\code{eztune}" which contains
#' a summary of the tuning parameters for the best model, the best loss
#' measure achieved (classification accuracy, AUC, MSE, or MAE), and the best
#' model.
#' \item{loss}{Best loss measure obtained by the optimizer. This is
#' the measure specified by the user that the optimizer uses to choose a
#' "best" model (classification accuracy, AUC, MSE, or MAE). Note that
#' if the default option is used it is the classification
#' accuracy for a binary response and the MSE for a continuous response.}
#' \item{model}{Best model found by the optimizer. The adaboost model
#'  comes from package \code{ada} (\code{ada} object), the elastic net model
#'  comes from package \code{glmnet} (\code{glmnet} object), the gbm model
#'  comes from package \code{gbm} (\code{gbm.object} object), and the svm
#'  model comes from package \code{e1071} (\code{svm} object).}
#' \item{n}{Number of observations used in model training when the
#' fast option is used.}
#' \item{nfold}{Number of folds used if cross validation is used
#' for optimization.}
#' \item{iter}{Tuning parameter for adaboost.}
#' \item{nu}{Tuning parameter for adaboost.}
#' \item{shrinkage}{Tuning parameter for adaboost and gbm.}
#' \item{lambda}{Tuning parameter for elastic net.}
#' \item{alpha}{Tuning parameter for elastic net.}
#' \item{n.trees}{Tuning parameter for gbm.}
#' \item{interaction.depth}{Tuning parameter for gbm.}
#' \item{n.minobsinnode}{Tuning parameter for gbm.}
#' \item{cost}{Tuning parameter for svm.}
#' \item{gamma}{Tuning parameter for svm.}
#' \item{epsilon}{Tuning parameter for svm regression.}
#' \item{levels}{If the model has a binary response, the levels of y are listed.}
#'
#' @examples
#' library(mlbench)
#' data(Sonar)
#' sonar <- Sonar[sample(1:nrow(Sonar), 100), ]
#'
#' y <- sonar[, 61]
#' x <- sonar[, 1:10]
#'
#' # Optimize an SVM using the default fast setting and Hooke-Jeeves
#' eztune(x, y)
#'
#' # Optimize an SVM with 3-fold cross validation and Hooke-Jeeves
#' eztune(x, y, fast = FALSE, cross = 3)
#'
#' # Optimize GBM using training set of 50 observations and Hooke-Jeeves
#' \donttest{eztune(x, y, method = "gbm", fast = 50, loss = "auc")}
#'
#' # Optimize SVM with 25% of the observations as a training dataset
#' # using a genetic algorithm
#' \donttest{eztune(x, y, method = "svm", optimizer = "ga", fast = 0.25)}
#'
#' @export
#'
eztune <- function(x, y, method = "svm", optimizer = "hjn", fast = TRUE,
                   cross = NULL, loss = "default") {

  nms <- colnames(x)

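  # A response with exactly two unique values is treated as binary
  # classification: its levels are saved and recoded to 0/1. Anything
  # else is treated as regression.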
  if(length(unique(y)) == 2) {
    lev <- levels(as.factor(y))
    y <- as.numeric(as.factor(y)) - 1
    type <- "bin"
    if(loss == "default") loss <- "class"
  } else {
    y <- as.numeric(as.character(y))
    type <- "reg"
    if(loss == "default") loss <- "mse"
  }

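  # A fast value greater than 1 is an observation count, so round it to a
  # whole number; do the same for the number of cross-validation folds.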
  if(fast > 1) {
    fast <- round(fast)
  }

  if(!is.null(cross)) {
    cross <- round(cross)
  }


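  # Build a dispatch key of the form "<type>.<method>.<optimizer>", e.g.
  # "bin.svm.hjn", and hand off to the matching internal tuning function.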
  command <- paste(type, method, optimizer, sep = ".")

  ezt <- switch(command,
                bin.ada.ga = ada.bin.ga(x, y, cross = cross, fast = fast, loss = loss),
                bin.ada.hjn = ada.bin.hjn(x, y, cross = cross, fast = fast, loss = loss),
                bin.gbm.ga = gbm.bin.ga(x, y, cross = cross, fast = fast, loss = loss),
                bin.gbm.hjn = gbm.bin.hjn(x, y, cross = cross, fast = fast, loss = loss),
                bin.svm.ga = svm.bin.ga(x, y, cross = cross, fast = fast, loss = loss),
                bin.svm.hjn = svm.bin.hjn(x, y, cross = cross, fast = fast, loss = loss),
                bin.en.ga = en.bin.ga(x, y, cross = cross, fast = fast, loss = loss),
                bin.en.hjn = en.bin.hjn(x, y, cross = cross, fast = fast, loss = loss),
                reg.gbm.ga = gbm.reg.ga(x, y, cross = cross, fast = fast, loss = loss),
                reg.gbm.hjn = gbm.reg.hjn(x, y, cross = cross, fast = fast, loss = loss),
                reg.svm.ga = svm.reg.ga(x, y, cross = cross, fast = fast, loss = loss),
                reg.svm.hjn = svm.reg.hjn(x, y, cross = cross, fast = fast, loss = loss),
                reg.en.ga = en.reg.ga(x, y, cross = cross, fast = fast, loss = loss),
                reg.en.hjn = en.reg.hjn(x, y, cross = cross, fast = fast, loss = loss)
  )

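  # Attach the predictor names and, for binary models, the original levels
  # of y, then tag the result with class "eztune".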
  ezt$variables <- nms

  if(grepl("bin.", command)) ezt$levels <- lev

  class(ezt) <- "eztune"

  ezt
}
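
# A minimal sketch (not part of the package) of the training-subset sizes
# implied by the `fast` argument documented above. The helper name `fast_n`
# is hypothetical and the exact rounding behavior is an assumption; the
# actual subsetting happens inside the internal tuners (e.g., svm.bin.hjn),
# which are defined elsewhere in the package.
fast_n <- function(fast, n) {
  if (isTRUE(fast)) {
    # TRUE: the smaller of 50% of the data or 200 observations
    min(ceiling(0.5 * n), 200)
  } else if (fast > 0 && fast < 1) {
    # a proportion of the data
    ceiling(fast * n)
  } else {
    # a positive integer count of observations
    round(fast)
  }
}

# With the 100-row Sonar subset from the examples above:
# fast_n(TRUE, 100)   # 50
# fast_n(0.25, 100)   # 25
# fast_n(50, 100)     # 50
# Per the @return documentation, the returned object's `n` component should
# match this count when the fast option is used.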
