Identify Best Binary Classification Model

Overview: bestclassifier

The bestclassifier function simplifies the task of identifying the best binary classification model for a given dataset. Done by hand, this is an arduous process: you have to find all of the relevant binary classification models, tune them, train them, and then compare them to find the best result. The bestclassifier function combines all of these steps into one user-friendly function.

Installation

To install this package from GitHub, run the following code:

if (!require(devtools)) {
  install.packages("devtools")
}
devtools::install_github("sross15/bestclassifier", build_vignettes = TRUE)

The source code is available here.

Usage

The bestclassifier function trains as many as eight machine learning binary classification models in order to identify the best predictive model for a given dataset. The available models include:

After identifying the best machine learning model, the bestclassifier function will:

Example

Below is an example of the bestclassifier function used on the CCD dataset. Because this dataset contains nearly 30,000 observations, I used only 1% of the training data (subset_train = 0.01) to build the binary classification models.

library(bestclassifier)

bestclassifier(data = CCD, form = default.payment.next.month ~ ., p = 0.7,
               method = "repeatedcv", number = 5, repeats = 1, tuneLength = 5,
               positive = "Default", model = c("log_reg", "lasso", "rf"),
               set_seed = 1234, subset_train = 0.01, desired_metric = "ROC")
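
To give a feel for the workflow this call automates, here is a minimal base R sketch of the same idea: split the data, keep 1% of the training rows, fit candidate models, and compare them on held-out data. It is not the package's implementation. The toy data frame, the auc helper, and the two candidate models are hypothetical. The sketch also assumes that p is the proportion of rows used for training and that "ROC" means area under the ROC curve; neither is stated in this README.

```r
# Synthetic stand-in for a dataset such as CCD (all names here are hypothetical).
set.seed(1234)
n <- 30000
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- factor(
  ifelse(runif(n) < plogis(-1 + 1.5 * toy$x1), "Default", "NoDefault"),
  levels = c("NoDefault", "Default")  # second level is the modelled (positive) class
)

# Assumed meaning of p = 0.7: 70% of rows go to training, the rest to testing.
train_idx <- sample(n, size = round(0.7 * n))
train <- toy[train_idx, ]
test  <- toy[-train_idx, ]

# subset_train = 0.01: keep only 1% of the training rows for model fitting.
sub_idx <- sample(nrow(train), size = round(0.01 * nrow(train)))
train_small <- train[sub_idx, ]

# Two candidate models (plain logistic regressions, for illustration only).
candidates <- list(
  x1_only = glm(y ~ x1, data = train_small, family = binomial),
  x2_only = glm(y ~ x2, data = train_small, family = binomial)
)

# Area under the ROC curve via the rank (Mann-Whitney) formulation.
auc <- function(score, is_positive) {
  r <- rank(score)
  n_pos <- sum(is_positive)
  n_neg <- sum(!is_positive)
  (sum(r[is_positive]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Score each candidate on the held-out rows and keep the one with the highest AUC.
test_auc <- sapply(candidates, function(m) {
  auc(predict(m, newdata = test, type = "response"), test$y == "Default")
})
best <- names(which.max(test_auc))

print(round(test_auc, 3))
print(best)
```

In this toy data only x1 carries signal, so the model built on x1 should come out ahead. The real function additionally tunes each model with resampling (the method, number, repeats, and tuneLength arguments), which this sketch leaves out.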

Getting Help

If you find a clear bug, please file a minimal reproducible example on GitHub. For questions and other comments, please use community.rstudio.com.



sross15/bestclassifier documentation built on May 23, 2019, 7:19 a.m.