An Introduction to OptimClassifier

library(OptimClassifier)
library(ggplot2)

OptimClassifier explained in one minute 🕛

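OptimClassifier provides a set of tools for creating models, selecting the best parameter combination for a model, and selecting the best threshold for your binary classification. The package contains tools for linear models (Optim.LM), generalized linear models (Optim.GLM), linear mixed models (Optim.LMM), discriminant analysis (Optim.DA), decision trees (Optim.CART), neural networks (Optim.NN), and support vector machines (Optim.SVM).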

A quick look at the functions

The main functions can be summarized in this table:

Function      Threshold optimization   Parameter optimization   What parameter or option?
Optim.LM      ✔️                       ✔️                       Transformations
Optim.GLM     ✔️                       ✔️                       Family & Links
Optim.LMM     ✔️                       ✔️                       Random variable
Optim.DA*     ✖️                       ✔️                       Linear or Quadratic
Optim.CART*   ✖️                       ✔️                       CP
Optim.NN      ✔️                       ✔️                       Hidden layers
Optim.SVM     ✔️                       ✔️                       Kernels

*These models are natively classifiers.

Installation

Install this package from CRAN (stable version):

install.packages("OptimClassifier")

Install this package from GitHub (development version):

You can do this with either of the following packages:

With devtools
library(devtools)
install_github("economistgame/OptimClassifier")
With remotes
library(remotes)
install_github("economistgame/OptimClassifier")

A simple example

The example shows you how to solve a common credit scoring problem with this package and GLM methodology.

Firstly, we must load the dataset. In this example, we use Australian Credit.

## Load a Dataset
data(AustralianCredit)

Then we create a model with the Optim.GLM function (or whichever Optim.* function you prefer).

## Create the model
creditscoring <- Optim.GLM(Y ~ ., AustralianCredit, p = 0.7, seed = 2018)
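
The p and seed arguments suggest a seeded train/test split, although the text does not spell this out. The following is only a sketch of that idea in base R, assuming p is the share of rows used for training. It uses the built-in mtcars data as a stand-in and is not the package's internal code.

```r
# Sketch of a reproducible 70/30 split (assumption: p = training share).
set.seed(2018)                       # fixes the random draw, like `seed`
p <- 0.7

n         <- nrow(mtcars)            # mtcars stands in for any data frame
train_idx <- sample(n, size = floor(p * n))

train <- mtcars[train_idx, ]         # rows that would be used to fit models
test  <- mtcars[-train_idx, ]        # held-out rows for evaluation

print(c(train = nrow(train), test = nrow(test)))
```

Fixing the seed means the same rows land in each part on every run, so results can be compared between sessions.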

Now you can print the results of the models:

### See a ranking of the models tested
print(creditscoring)

Would you rather see a graphic? Try typing plot(creditscoring).

### Bored of R output? Try a plot
plot(creditscoring)

But what about the details (coefficients and other information) of the best model? Or of the second model in the ranking? We can simply call summary:

### Access the summary of the best model
summary(creditscoring)
### Access the summary of the second model
summary(creditscoring, 2)

Frequently Asked Questions (FAQs)

What is optimization?

Optimization is the process of adjusting the parameters of your trained model to improve the quality of your classification. Depending on your goals, optimization can involve implementation improvements or changes to the classification model itself. This package focuses on two aspects: the classification threshold and several model-specific options.

Why optimize your classification model?

Optimizing your classification model is important when you want it to reach its full potential. Through optimization, you can reduce the root mean square error (RMSE), increase the success rate, or pursue other goals, such as minimizing the type I error or the type II error.
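
To make the threshold idea concrete, here is a minimal base R sketch that scans a grid of cut-offs and keeps the one with the highest success rate. The mtcars data, the success_rate helper, and the grid are all illustrative assumptions, not part of OptimClassifier. For brevity the rates are computed on the same data used for fitting.

```r
# Fit a simple logistic model; `am` (0/1) is the binary response.
fit  <- glm(am ~ wt, data = mtcars, family = binomial())
prob <- predict(fit, type = "response")   # fitted probabilities

# Hypothetical helper: share of correct predictions at a given threshold.
success_rate <- function(threshold, prob, actual) {
  predicted <- as.integer(prob > threshold)
  mean(predicted == actual)
}

# Evaluate every candidate threshold on the grid.
thresholds <- seq(0.05, 0.95, by = 0.05)
rates <- vapply(thresholds, success_rate, numeric(1),
                prob = prob, actual = mtcars$am)

# Keep the first threshold that reaches the highest success rate.
best <- thresholds[which.max(rates)]
print(data.frame(threshold = thresholds, success_rate = rates))
print(best)
```

Swapping success_rate for a function that returns the type I or type II error (and using which.min) would target those goals instead.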

How does Optim.LM optimize a Linear Model?

Optim.LM applies transformations to the response variable to improve the precision of the linear model. The function then searches for the threshold that gives the best possible result for your goal.

Transformations included:

How does Optim.GLM optimize a Generalized Linear Model?

Optim.GLM tries different error distributions (called family in R) and several link functions, which transform the expected response (called link in R). The function then searches for the threshold that gives the best possible result for your goal.
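
The search over families and links can be pictured with a short base R sketch. The candidate list below is an illustrative assumption, not the set that Optim.GLM actually tests, and the models are ranked here by RMSE on the fitting data using the built-in mtcars data.

```r
# Illustrative family/link combinations (not the package's own list).
candidates <- list(
  "binomial(logit)"    = binomial(link = "logit"),
  "binomial(probit)"   = binomial(link = "probit"),
  "binomial(cloglog)"  = binomial(link = "cloglog"),
  "gaussian(identity)" = gaussian(link = "identity"),
  "poisson(log)"       = poisson(link = "log")
)

# Fit one GLM per candidate and record its root mean square error.
rmse <- vapply(candidates, function(fam) {
  fit <- glm(am ~ wt, data = mtcars, family = fam)
  sqrt(mean((mtcars$am - fitted(fit))^2))
}, numeric(1))

# Lowest RMSE first: a ranking of the tested models.
print(sort(rmse))
```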

Models trained with this function:

How does Optim.LMM optimize a Linear Mixed Model?

Optim.LMM searches for the variable that most improves the model's precision when it is used as a random variable. The function then searches for the threshold that gives the best possible result for your goal.

How does Optim.DA optimize a Discriminant Analysis?

Optim.DA tries to train both a Quadratic and a Linear Discriminant Analysis, because the characteristics of the data sometimes make it impossible to train a QDA.
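
One way to picture this fallback is the sketch below, which uses lda and qda from the MASS package and simply drops whichever model fails to fit. The fit_da and accuracy helpers are hypothetical names, and a two-species subset of iris stands in for a real credit dataset. This is not how Optim.DA is necessarily implemented.

```r
library(MASS)  # provides lda() and qda()

# Hypothetical helper: fit both models, keep only those that succeed.
fit_da <- function(formula, data) {
  fits <- list(
    LDA = tryCatch(lda(formula, data = data), error = function(e) NULL),
    QDA = tryCatch(qda(formula, data = data), error = function(e) NULL)
  )
  Filter(Negate(is.null), fits)
}

# Hypothetical helper: share of correctly classified rows.
accuracy <- function(fit, data, actual) {
  mean(predict(fit, data)$class == actual)
}

# Binary problem: versicolor vs. virginica.
binary <- droplevels(iris[iris$Species != "setosa", ])

# Plenty of rows per class: both models can be trained and compared.
full <- fit_da(Species ~ ., binary)
acc  <- sapply(full, accuracy, data = binary, actual = binary$Species)
print(acc)

# Only four rows per class with four predictors: qda() stops with an
# error because a group is too small, so only the LDA remains.
small <- fit_da(Species ~ ., binary[c(1:4, 51:54), ])
print(names(small))
```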

How does Optim.CART optimize a Decision Tree?

Optim.CART focuses on the pruning process and compares several levels of pruning. Pruning is controlled by a complexity parameter (CP), which is the amount by which splitting a node improved the relative error.
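
The comparison of pruning levels can be sketched with the rpart package. The example grows a deliberately large tree on the kyphosis data that ships with rpart, then prunes it at each CP value in its complexity table. The evaluate_cp helper and the control settings are assumptions for illustration, and accuracy is measured on the fitting data for brevity.

```r
library(rpart)  # provides rpart(), prune() and the kyphosis data

# Grow a large tree first (cp = 0 disables early stopping on complexity).
full_tree <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
                   method = "class",
                   control = rpart.control(cp = 0, minsplit = 5))

# Candidate pruning levels, from strongest pruning to none.
cps <- full_tree$cptable[, "CP"]

# Hypothetical helper: prune at `cp`, then report tree size and accuracy.
evaluate_cp <- function(cp) {
  pruned <- prune(full_tree, cp = cp)
  pred   <- predict(pruned, kyphosis, type = "class")
  c(cp       = cp,
    leaves   = sum(pruned$frame$var == "<leaf>"),
    accuracy = mean(pred == kyphosis$Kyphosis))
}

results <- as.data.frame(t(vapply(cps, evaluate_cp, numeric(3))))
print(results)
```

Smaller CP values keep more splits, so the number of leaves grows down the table. Held-out data would give a fairer accuracy comparison than the fitting data used here.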

How does Optim.NN optimize a Neural Network?

Optim.NN searches for the number of hidden layers that most improves the model's precision. The function then searches for the threshold that gives the best possible result for your goal.

How does Optim.SVM optimize a Support Vector Machine?

Optim.SVM tries different types of kernels to improve the precision. The function then searches for the threshold that gives the best possible result for your goal.

Kernels trained with this function:

Bugs and feature requests

If you find problems with the package, or there's anything that it doesn't do which you think it should, please submit them to https://github.com/economistgame/OptimClassifier/issues. In particular, let me know about optimizers and formats which you'd like supported, or if you have a workflow which might make sense for inclusion as a default convenience function.




OptimClassifier documentation built on Jan. 14, 2020, 5:10 p.m.