knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Here we will use the HR churn data (https://www.kaggle.com/) to present the breakDown package for glm models.

The data is in the breakDown package

library(breakDown)
head(HR_data, 3)

Now let's create a logistic regression model for churn, the left variable.

model <- glm(left~., data = HR_data, family = "binomial")

But how to understand which factors drive predictions for a single observation?

With the breakDown package!

Explanations for the linear predictor.

library(ggplot2)
predict(model, HR_data[11,], type = "link")

explain_1 <- broken(model, HR_data[11,])
explain_1
plot(explain_1) + ggtitle("breakDown plot for linear predictors")

Explanations for the probability with intercept set as an origin.

predict(model, HR_data[11,], type = "response")

explain_1 <- broken(model, HR_data[11,], baseline = "intercept")
explain_1
plot(explain_1, 
     trans = function(x) exp(x)/(1+exp(x))) + ggtitle("Predicted probability of leaving the company")+ scale_y_continuous( limits = c(0,1), name = "probability", expand = c(0,0))


pbiecek/breakDown documentation built on March 15, 2024, 10:46 a.m.