Model-agnostic breakDown plots for ranger

Here we will use the HR churn data to demonstrate the breakDown package for ranger models.

The data ships with the breakDown package.

library(breakDown)
head(HR_data, 3)

Now let's fit a ranger probability forest for churn, which is stored in the variable named left.

library(ranger)
HR_data$left <- factor(HR_data$left)
model <- ranger(left ~ ., data = HR_data, importance = "impurity",
                probability = TRUE, min.node.size = 2000)

# Return the predicted probability of the second class level of left
predict.function <- function(model, new_observation) predict(model, new_observation, type = "response")$predictions[,2]

predict.function(model, HR_data[11,])
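
The wrapper above selects the second column of the probability matrix by position. As a hedged alternative, the sketch below selects the column by class label instead, which does not depend on the order of the factor levels. It assumes a toy binary problem built from the built-in iris data rather than HR_data, and the names toy, toy_model and predict_positive are hypothetical, chosen only for illustration.

```r
library(ranger)

# Toy binary problem on the built-in iris data (an assumption for illustration)
toy <- iris[, 1:4]
toy$target <- factor(as.integer(iris$Species == "virginica"))  # levels "0" and "1"

set.seed(1)
toy_model <- ranger(target ~ ., data = toy, probability = TRUE, num.trees = 50)

# Hypothetical variant of predict.function: pick the probability column
# by class label instead of by position
predict_positive <- function(model, new_observation, positive = "1") {
  predict(model, new_observation, type = "response")$predictions[, positive]
}

# Row 1 is a setosa flower, row 150 is a virginica flower
p <- predict_positive(toy_model, toy[c(1, 150), ])
```

A function of this shape can be passed as predict.function in the same way as the positional version, as long as it returns one number per row.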

But how can we tell which factors drive the prediction for a single observation?

With the breakDown package!

Explanations for the trees' votes.


library(ggplot2)
explain_1 <- broken(model, HR_data[11,-7], data = HR_data[,-7],
                    predict.function = predict.function, 
                    direction = "down")
plot(explain_1) + ggtitle("breakDown plot (direction=down) for ranger model")

explain_2 <- broken(model, HR_data[11,-7], data = HR_data[,-7],
                    predict.function = predict.function, 
                    direction = "up")
plot(explain_2) + ggtitle("breakDown plot (direction=up) for ranger model")
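
For intuition about what such a plot decomposes, here is a simplified base-R sketch of a greedy, step-up style attribution. It is not the breakDown implementation: it assumes a logistic regression on the built-in mtcars data as a stand-in for the forest, and greedy_contributions and predict_prob are hypothetical helper names. At each step it fixes one variable at the observation's value across the whole data set, measures the change in the mean prediction, and keeps the variable with the largest change.

```r
# Simplified, illustrative step-up attribution (not the breakDown implementation).
# A logistic regression on mtcars stands in for the forest (an assumption).
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial)

predict_prob <- function(model, newdata) {
  as.numeric(predict(model, newdata, type = "response"))
}

# Hypothetical helper: greedily attribute the prediction to single variables
greedy_contributions <- function(model, observation, data, predict_fun) {
  remaining <- colnames(data)
  baseline <- mean(predict_fun(model, data))  # average prediction over the data
  current <- baseline
  contributions <- numeric(0)
  while (length(remaining) > 0) {
    # Mean prediction after fixing each remaining variable at the observation's value
    candidate_means <- sapply(remaining, function(v) {
      tmp <- data
      tmp[[v]] <- observation[[v]]
      mean(predict_fun(model, tmp))
    })
    # Keep the variable that moves the mean prediction the most
    best <- remaining[which.max(abs(candidate_means - current))]
    contributions[best] <- candidate_means[[best]] - current
    data[[best]] <- observation[[best]]       # leave this variable fixed from now on
    current <- candidate_means[[best]]
    remaining <- setdiff(remaining, best)
  }
  list(baseline = baseline, contributions = contributions)
}

vars <- c("wt", "hp")
result <- greedy_contributions(fit, mtcars[1, vars], mtcars[, vars], predict_prob)
print(result)
```

In this sketch the baseline plus the sum of the contributions equals the model's prediction for the explained observation, because once every variable is fixed all rows coincide with that observation.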


breakDown documentation built on May 29, 2024, 10:37 a.m.