knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
When creating binary classification models, you must:
+ identify your desired machine learning models
+ tune the parameters for each model
+ train each model on your training data
+ test the models on your testing data
+ compare the results to find the best model
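To make the steps above concrete, here is a minimal base R sketch of the manual workflow. It is not how bestclassifier is implemented; it only illustrates the process. The data are simulated, the two candidate models are both logistic regressions (which have no parameters to tune, so that step is skipped), and the names `dat`, `forms`, `auc`, and `test_auc` are made up for this illustration.

```r
# Simulated data standing in for a real binary classification problem
set.seed(1234)
n <- 500
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- factor(ifelse(dat$x1 + 0.5 * dat$x2 + rnorm(n) > 0, "Yes", "No"),
                levels = c("No", "Yes"))

# Split the rows into training (70%) and testing (30%) sets
train_idx <- sample(seq_len(n), size = floor(0.7 * n))
train <- dat[train_idx, ]
test  <- dat[-train_idx, ]

# Identify the candidate models (hypothetical: two logistic regressions)
forms <- list(small = y ~ x1, full = y ~ x1 + x2)

# AUC computed with the rank-sum (Mann-Whitney) identity
auc <- function(score, truth, positive) {
  pos   <- truth == positive
  r     <- rank(score)
  n_pos <- sum(pos)
  n_neg <- sum(!pos)
  (sum(r[pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Train each model on the training data, then score it on the testing data
test_auc <- sapply(forms, function(f) {
  fit <- glm(f, data = train, family = binomial)
  p   <- predict(fit, newdata = test, type = "response")
  auc(p, test$y, positive = "Yes")
})

# Compare the results and pick the model with the highest test AUC
best <- names(which.max(test_auc))
test_auc
best
```

Even with two simple models, the bookkeeping adds up quickly. Repeating it for eight model families, each with its own tuning grid, is the work the package is meant to absorb.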
bestclassifier simplifies this complex, arduous process by letting you complete all of those tasks in a single function call.
+ This function supports eight machine learning binary classification models:
  - logistic regression
  - lasso regression
  - random forest
  - extreme gradient boosting
  - support vector machine
  - artificial neural network
  - linear discriminant analysis
  - k nearest neighbors
To explore the bestclassifier function, we will use the CCD dataset. This dataset contains default status and payment information for all credit card customers transacting with a Taiwanese bank in 2005.
CCD <- bestclassifier::CCD
str(CCD)
In the example below, I am seeking the machine learning model that produces the highest AUC when classifying credit card default. The models predict the "Default" category of the outcome variable (default.payment.next.month in the call below) using all of the other variables in the dataset as predictors. Because the CCD data contains nearly 30,000 observations, I am training each model on only 1% of the training data to get results quickly.
library(bestclassifier)
bestclassifier(data = CCD,
               form = default.payment.next.month ~ .,
               p = 0.7,
               method = "repeatedcv",
               number = 5,
               repeats = 1,
               tuneLength = 5,
               positive = "Default",
               model = c("log_reg", "lasso", "lda", "svm", "lda", "knn", "ann", "xgboost"),
               set_seed = 1234,
               subset_train = .01,
               desired_metric = "ROC")
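To get a feel for how little data each model sees with these settings, the arithmetic can be sketched as follows. This assumes that `p` is the proportion of rows assigned to the training set and that `subset_train` is the fraction of those training rows actually used; the row count is rounded to 30,000 for illustration, and the package may round differently.

```r
n_total      <- 30000  # approximate row count of CCD ("nearly 30,000")
p            <- 0.7    # assumed: proportion of rows assigned to training
subset_train <- 0.01   # assumed: fraction of the training rows actually used

n_train <- round(p * n_total)            # rows in the training split
n_used  <- round(subset_train * n_train) # rows each model is trained on

n_train
n_used
```

Under these assumptions each model is fit on only about 210 rows, which keeps the run fast but makes the reported metrics noisy. Raising `subset_train` should give more stable results at the cost of a longer run time.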
According to the bar graph, the lasso regression model performed best on the training data, with an AUC of 0.6495.
Random Forest results on testing data:
+ Accuracy: 78.2%
+ Sensitivity: 3.1%
+ Specificity: 99.6%
+ Positive Predictive Value: 67.8%
+ Negative Predictive Value: 78.3%
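For reference, all five of these metrics come from the four cells of a 2x2 confusion matrix. The sketch below uses made-up counts, not the CCD results, and the helper name `class_metrics` is hypothetical rather than part of bestclassifier.

```r
# Hypothetical confusion-matrix counts (NOT the CCD results)
tp <- 40   # actual positive, predicted positive
fn <- 10   # actual positive, predicted negative
fp <- 20   # actual negative, predicted positive
tn <- 130  # actual negative, predicted negative

# Standard definitions of the five metrics, returned as proportions
class_metrics <- function(tp, fn, fp, tn) {
  c(accuracy    = (tp + tn) / (tp + fn + fp + tn),
    sensitivity = tp / (tp + fn),  # share of actual positives caught
    specificity = tn / (tn + fp),  # share of actual negatives caught
    ppv         = tp / (tp + fp),  # share of positive predictions that are right
    npv         = tn / (tn + fn))  # share of negative predictions that are right
}

m <- class_metrics(tp, fn, fp, tn)
round(100 * m, 1)  # expressed as percentages
```

Read against these definitions, a sensitivity of 3.1% alongside a specificity of 99.6% means the model labels almost every customer as a non-defaulter: it rarely raises a false alarm, but it also misses nearly all of the actual defaults.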