Identify Best Binary Classification Model

The bestclassifier function was created to simplify the task of identifying the best binary classification model for a given dataset. This task usually means finding all of the relevant binary classification models, tuning them, training them, and then comparing their results. The bestclassifier function combines all of these steps into one user-friendly function.

To install this package, run the following code:

if (!require(devtools)) {
  install.packages("devtools")
}
devtools::install_github("sross15/bestclassifier", build_vignettes = TRUE)

The source code is available on GitHub in the sross15/bestclassifier repository.

The bestclassifier function trains as many as eight machine learning binary classification models in order to identify the best predictive model for a given dataset. The available models include:

logistic regression
lasso regression
random forest
extreme gradient boosting
support vector machine
artificial neural network
linear discriminant analysis
k-nearest neighbors

After identifying the best machine learning model, the bestclassifier function will:

print a bar graph depicting the performance of each model on the training data
print the name of the best binary classification model along with its predictive performance score (either AUC or accuracy, depending on which metric the user selects)
apply the best trained model to unseen testing data and return a confusion matrix with overall performance results
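
The steps above can be pictured with the caret package. Whether bestclassifier is actually built on caret is an assumption here, so the following is only a rough sketch of the workflow, not the package's source. The simulated data, the two candidate models, and every variable name are made up for illustration, and the caret and pROC packages need to be installed.

```r
library(caret)

set.seed(1234)

# Simulated two-class data (illustrative only, not the CCD dataset)
n  <- 400
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- factor(ifelse(x1 + 0.5 * x2 + rnorm(n) > 0, "Default", "NoDefault"),
             levels = c("Default", "NoDefault"))
dat <- data.frame(x1, x2, y)

# Hold out 30% of the rows as unseen testing data
idx      <- createDataPartition(dat$y, p = 0.7, list = FALSE)
train_df <- dat[idx, ]
test_df  <- dat[-idx, ]

# 5-fold cross-validation, scored by area under the ROC curve
ctrl <- trainControl(method = "repeatedcv", number = 5, repeats = 1,
                     classProbs = TRUE, summaryFunction = twoClassSummary)

# Train each candidate model with the same resampling scheme
fits <- list(
  log_reg = train(y ~ ., data = train_df, method = "glm", family = "binomial",
                  metric = "ROC", trControl = ctrl),
  knn     = train(y ~ ., data = train_df, method = "knn", tuneLength = 5,
                  metric = "ROC", trControl = ctrl)
)

# Best cross-validated AUC per model, shown as a bar graph
cv_auc <- sapply(fits, function(f) max(f$results$ROC))
barplot(cv_auc, ylab = "Cross-validated AUC")

# Pick the winner and evaluate it on the held-out rows
best <- names(which.max(cv_auc))
pred <- predict(fits[[best]], newdata = test_df)
cm   <- confusionMatrix(pred, test_df$y, positive = "Default")

print(best)
print(cm)
```

Comparing the models on cross-validated scores and touching the testing data only once, for the final winner, keeps the reported confusion matrix an honest estimate of out-of-sample performance.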

Below is an example of the bestclassifier function applied to the CCD dataset. Because this dataset contains nearly 30,000 observations, I used only 1% of the training data (subset_train = .01) to build the binary classification models.

library(bestclassifier)

bestclassifier(data = CCD, form = default.payment.next.month ~ ., p = 0.7,
               method = "repeatedcv", number = 5, repeats = 1, tuneLength = 5,
               positive = "Default", model = c("log_reg", "lasso", "rf"),
               set_seed = 1234, subset_train = .01, desired_metric = "ROC")
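
To make the effect of p = 0.7 and subset_train = .01 concrete, here is a small base R sketch of the row counts involved. It assumes that the subset is drawn from the training partition after the split, and it uses a round figure of 30,000 rows and simple random sampling. The package itself may split and subsample differently (for example, with stratification), and the variable names are illustrative.

```r
set.seed(1234)

n            <- 30000   # approximate number of rows in the CCD data
p            <- 0.7     # share of rows assigned to training
subset_train <- 0.01    # share of the training rows actually used

rows <- seq_len(n)

# Split the row indices into training and testing partitions
train_rows <- sample(rows, size = round(p * n))
test_rows  <- setdiff(rows, train_rows)

# Keep only a small random subset of the training partition
train_subset <- sample(train_rows,
                       size = round(subset_train * length(train_rows)))

length(train_rows)    # 21000
length(test_rows)     # 9000
length(train_subset)  # 210
```

Under these assumptions the models are tuned on only about 210 rows, which keeps the example fast at the cost of noisier performance estimates.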

If you find a clear bug, please file a minimal reproducible example on GitHub. For questions and other comments, please use community.rstudio.com.

sross15/bestclassifier documentation built on May 23, 2019, 7:19 a.m.
