README.md
In AlexandreRives/RegLog-EDA: Logistic Regression

LogRegEDA

The LogRegEDA is a package which give you an easy way to instantiate the logistic regression using the stochastic gradient descent with four modes :

Batch mode
Mini-batch mode
Online mode
Batch parallel mode

Using this package will allow you to fit, to predict, to see your result and to see an interactive plot of the loss function.

devtools::install_github("AlexandreRives/RegLog-EDA", dependencies = TRUE)

First of all, you have to load the package using the library() command in your console :

library(LogRegEDA)

Then, you need to import a dataset. Don't worry, we had your back and you can use the one we put in the package if you first want how the package works :

breast <- LogRegEDA::breast

Great ! Now you can see on the right that the dataset is loaded. Make a little check with the head() function in your console to confirm that the dataset is well loaded.

Let's split the dataset in a train and a test set :

breast_train = breast[1:500,]
breast_test = breast[501:699,]

Now you can be focused on our package.

Now that everything is loaded and that you have splited your dataset in a train-test. We can launch the fit function. As we said, we have coded 3 different modes. Let's start with the batch :

Below, an example on how you have to use it :

fit <- fit_reg_log(formula = class ~ ., data = breast_train, mode = "batch", normalize = TRUE, learning_rate = 0.1, max_iter = 1000, cores = 1, graph = FALSE, epsilon = 1e-05)

One option can be used if you want to scale your dataset : normalize : FALSE by default, TRUE if you want to scale your data. graph : FALSE by default, if you want to plot the graph during the gradient descent, turn it on TRUE.

Using the $ after the fit object can give you an access to 3 lists : - the coefficients - the costs - the residuals

If you want to see the result of your training set, use the print() or the summary() functions.

Here the summary function :

summary(fit)

###############################################################################################################

Results of the logistic regression : 
Call : 
fit_reg_log(formula = class ~ ., data = breast_train, mode = "batch", 
    normalize = TRUE, learning_rate = 0.1, max_iter = 1000, cores = 1, 
    graph = FALSE, epsilon = 1e-05)

Deviance Residuals :
      Min      Q1  Median     Q3 Max   Mean
1 -0.9975 -0.0274 -0.0097 0.0037   1 -7e-04

Coefficients & Features : 
  (Intercept)    clump  ucellsize ucellshape mgadhesion     sepics   bnuclei bchromatin   normnucl    mitoses
1    1.034724 -1.08406 -0.6294483 -0.7943087 -0.4462591 -0.3586972 -1.263574 -0.6778172 -0.4272717 -0.1065407

Degrees of Freedom : 499 Total (- intercept); 490 Residual
Deviance :  4570.997
AIC : 4590.997

Execution time : 0.7300706 sec.

###############################################################################################################

Then the print function :

print(fit)

###############################################################################################################

Results of the logistic regression : 
Call : 
fit_reg_log(formula = class ~ ., data = breast_train, mode = "batch", 
    normalize = TRUE, learning_rate = 0.1, max_iter = 1000, cores = 1, 
    graph = FALSE, epsilon = 1e-05)

Coefficients & Features : 
  (Intercept)    clump  ucellsize ucellshape mgadhesion     sepics   bnuclei bchromatin   normnucl    mitoses
1    1.034724 -1.08406 -0.6294483 -0.7943087 -0.4462591 -0.3586972 -1.263574 -0.6778172 -0.4272717 -0.1065407

Degrees of Freedom : 499 Total (- intercept); 490 Residual
Deviance :  4570.997
AIC :  4590.997

Execution time : 0.7300706 sec.

###############################################################################################################

As the batch mode, you have to use the fit_reg_log function but 2 things are changing :

fit <- fit_reg_log(formula = class ~ ., data = breast_train, mode = "mini_batch", batch_size = 10, learning_rate = 0.01, max_iter = 100, graph = FALSE, epsilon = 0.0001)

You have to change the mode and add the batch_size. You have obviously an access to the print() and summary() functions too and to the elements using the $ after the fitted object.

As the batch mode, you have to use the fit_reg_log function :

fit <- fit_reg_log(formula = class ~ ., data = breast_train, mode = "online", learning_rate = 0.01, max_iter = 100, graph = FALSE, epsilon = 0.0001)

You just have to change the mode and set it "online" to train your dataset. the print() and summary() functions can give you details about your fitted object.

As the batch mode, you have to use the fit_reg_log function :

fit <- fit_reg_log(formula = class ~ ., data = breast_train, mode = "batch_parallel", learning_rate = 0.01, max_iter = 100, cores = 4, graph = FALSE, epsilon = 0.0001)

Instead of using the "batch" mode, change the it to "batch_parallel" to train your dataset. We advise you to use it if you are planning to fit a big dataset. Set the number of cores you want to use. If you don't know how many cores your computer has, just put -1 and it will get all the cores minus 1 to let you use your computer while your data are in the fit function.

Once you have fitted your data, you can go for the predict function.

The predict function allows you to get the probability or the class predicted.

predict <- predict_reg_log(object = fit, newdata = breast_test, type = "posterior")

predict <- predict_reg_log(object = fit, newdata = breast_test, type = "class")

By using a print() on your predict object, you will see a tab with all your predict probability or the class predicted. You will be able to make a confusion matrix or to compute the MSE.

To complete the package, we have provided 2 functions that allows you to encode your dataset :

The first one encode your qualitative data, don't worry and put all your features inside :

dummies(x = X_data)

The second one encode your target data :

dummies_y(y = target)

AlexandreRives/RegLog-EDA documentation built on Dec. 17, 2021, 7:50 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

AlexandreRives/RegLog-EDA
Logistic Regression

README.md
In AlexandreRives/RegLog-EDA: Logistic Regression

LogRegEDA

Installing the package

How to use the package ?

Loading the package

Importing the dataset

Fit function

Batch mode

Mini-batch mode

Online mode

Batch parallel mode

Predict function

Posterior type

Class type

Diverse functions

R Package Documentation

Browse R Packages

We want your feedback!

AlexandreRives/RegLog-EDA Logistic Regression

README.md In AlexandreRives/RegLog-EDA: Logistic Regression

LogRegEDA

Installing the package

How to use the package ?

Loading the package

Importing the dataset

Fit function

Batch mode

Mini-batch mode

Online mode

Batch parallel mode

Predict function

Posterior type

Class type

Diverse functions

R Package Documentation

Browse R Packages

We want your feedback!

AlexandreRives/RegLog-EDA
Logistic Regression

README.md
In AlexandreRives/RegLog-EDA: Logistic Regression