knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
The explore package offers a simplified way to use machine learning to understand and explain patterns in the data.
explain_tree() creates a decision tree. The target can be binary, categorical or numerical.
explain_forest() creates a random forest. The target can be binary, categorical or numerical.
explain_xgboost() creates a gradient boosting model (XGBoost). The target must be binary (0/1, FALSE/TRUE).
explain_logreg() creates a logistic regression. The target must be binary.
balance_target() balances a target.
weight_target() creates weights for the decision tree.

We use synthetic data in this example.
library(dplyr)
library(explore)
data <- create_data_buy(obs = 1000)
glimpse(data)
data %>% explain_tree(target = buy)
data %>% explain_tree(target = mobiledata_prd)
data %>% explain_tree(target = age)
data %>% explain_forest(target = buy, ntree = 100)
To get the model itself as output, use the parameter out = "model", or use out = "all" to get everything (feature importance as plot and table, plus the trained model). To use the model for a prediction, you can use predict_target().
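The following is a minimal sketch of that workflow, not a definitive recipe. It assumes that explain_forest() returns the fitted model when called with out = "model", and that predict_target() takes the model through a model argument and appends the prediction as an extra column. Non-numeric variables are dropped first, purely as a simplification for this sketch, so that training and scoring data have identical column types.

```r
library(dplyr)
library(explore)

# Synthetic data; keep only numeric variables (simplifying assumption)
data_num <- create_data_buy(obs = 1000) %>%
  drop_var_not_numeric()

# Return the trained random forest instead of the importance plot
model <- data_num %>%
  explain_forest(target = buy, ntree = 100, out = "model")

# Score the data with the trained model
# (assumed behaviour: the prediction is added as a new column)
scored <- data_num %>%
  predict_target(model = model)

glimpse(scored)
```

In practice you would score new observations rather than the training data; the same call applies as long as the new data contains the variables the model was trained on.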
As XGBoost only accepts numeric variables, we use drop_var_not_numeric() to drop mobiledata_prd, as it is not a numeric variable. An alternative would be to convert the non-numeric variables into numeric ones.
data %>% drop_var_not_numeric() |> explain_xgboost(target = buy)
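The alternative mentioned above, converting a non-numeric variable into numeric ones instead of dropping it, can be sketched with base R. This is only an illustration: the toy data frame and its category values are hypothetical stand-ins for the synthetic data, and one-hot encoding via model.matrix() is just one possible conversion.

```r
# Toy data frame standing in for the synthetic data (values are hypothetical)
toy <- data.frame(
  buy = c(1, 0, 1, 0),
  age = c(34, 51, 28, 45),
  mobiledata_prd = c("NO", "STICK", "BUSINESS", "NO"),
  stringsAsFactors = FALSE
)

# One-hot encode the character column: one 0/1 indicator column per category
dummies <- model.matrix(~ mobiledata_prd - 1, data = toy)

# Replace the original column with its numeric indicator columns
toy_numeric <- cbind(toy[, c("buy", "age")], as.data.frame(dummies))

# Check that every remaining column is numeric
all(sapply(toy_numeric, is.numeric))
```

A data frame prepared this way keeps the information in the categorical variable and can be passed to explain_xgboost() without calling drop_var_not_numeric().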
Use the parameter out = "all" to get more details about the training.
train <- data %>% drop_var_not_numeric() |> explain_xgboost(target = buy, out = "all")
train$importance
train$tune_plot
train$tune_data
To use the model for a prediction, you can use predict_target()
data %>% explain_logreg(target = buy)
If you have a data set with a very unbalanced target (in this case only 5% of all observations have buy == 1), it may be difficult to create a decision tree.
data <- create_data_buy(obs = 2000, target1_prob = 0.05)
data %>% describe(buy)
It may help to balance the target before growing the decision tree (or to use weights as an alternative). In this example we downsample the data so that 10% of the observations have buy == 1.
data %>% balance_target(target = buy, min_prop = 0.10) %>% explain_tree(target = buy)
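Instead of downsampling, the tree can be grown on the full data with case weights. The sketch below assumes that weight_target() returns one weight per observation and that explain_tree() accepts these through a weights argument; check the package reference before relying on the exact signatures.

```r
library(dplyr)
library(explore)

# Unbalanced synthetic data: roughly 5% of observations have buy == 1
data <- create_data_buy(obs = 2000, target1_prob = 0.05)

# One weight per observation, intended to give the rare class more influence
weights <- weight_target(data, target = buy)

# Grow the tree on all observations, using the weights instead of downsampling
data %>% explain_tree(target = buy, weights = weights)
```

Weighting keeps every observation in the training data, whereas balance_target() discards part of the majority class.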