This function computes the partial dependence of a supervised machine learning model over a range of values for a single observation. It does not work for multiclass problems. See predictor_xgb for an example of a predictor function that you can use as a template for writing your own.
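A custom predictor only needs to follow the function(model, data) contract described for the predictor argument. The following is a minimal sketch for a base-R linear model; predictor_lm is a hypothetical name that is not part of the package, and it is an assumption that a plain numeric vector of predictions is what partial_dep.obs expects back.

```r
# Hypothetical predictor following the function(model, data) contract.
# Assumption: the model is a base-R lm fit; predictor_lm is illustrative only.
predictor_lm <- function(model, data) {
  # predict.lm wants data.frame-like input and returns a (named) numeric vector;
  # as.numeric() drops the names so only the predictions remain
  as.numeric(predict(model, newdata = as.data.frame(data)))
}

# Fit on the first 31 rows of mtcars, then predict the 32nd
fit <- lm(mpg ~ hp + cyl + am + carb, data = mtcars[-32, ])
preds <- predictor_lm(fit, mtcars[32, -1])
```

Any model type could be wrapped the same way, as long as the wrapper accepts the model and a table of observations and returns one prediction per row.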
model |
Type: unknown. The model to pass to |
predictor |
Type: function(model, data). The predictor function, which takes a model and data as inputs and returns predictions. |
data |
Type: data.table (mandatory). The data to sample from for the partial dependence with |
observation |
Type: data.table (mandatory). The observation we want to get partial dependence from. It is mandatory to use a data.table to retain column names. |
column |
Type: character. The column we want partial dependence from. You can specify two or more |
accuracy |
Type: integer. The accuracy of the partial dependence, expressed as the number of sampled points per percentile of the |
safeguard |
Type: logical. Whether to safeguard |
safeguard_val |
Type: integer. The maximum number of observations allowed when |
exact_only |
Type: logical. Whether to select only exact values for data sampling. Defaults to |
label_name |
Type: character. The column name given to the predicted values in the output table. Defaults to |
comparator_name |
Type: character. The column name given to the evolution value ( |
A list with the following elements: grid_init for the grid before expansion, grid_exp for the expanded grid with predictions, preds for the predictions, and obs for the original prediction on the observation.
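To make the returned elements more concrete, here is a rough base-R sketch of the general idea behind a single-observation partial dependence grid. It is not the package's implementation: the lm model, the use of expand.grid, and the sign-based evo column are all assumptions chosen for illustration, and the variable names only loosely mirror the list elements described above.

```r
# Illustrative sketch only: base R with an lm model instead of xgboost.
data(mtcars)
train <- mtcars[-32, ]   # first 31 observations
obs   <- mtcars[32, ]    # observation to analyze
fit   <- lm(mpg ~ hp + cyl + am + carb, data = train)

cols <- c("cyl", "am")   # columns to vary

# Grid before expansion: the unique observed values of each column
grid_init <- lapply(train[cols], function(x) sort(unique(x)))

# Expanded grid: every combination of those values (3 x 2 = 6 rows here)
grid_exp <- expand.grid(grid_init)

# Repeat the observation once per grid row, then overwrite the varied columns
newdata <- obs[rep(1, nrow(grid_exp)), , drop = FALSE]
newdata[cols] <- grid_exp
rownames(newdata) <- NULL

# Original prediction on the untouched observation
obs_pred <- as.numeric(predict(fit, newdata = obs))

# Predictions over the grid, plus a comparator against the original prediction
# (assumption: -1 / 0 / 1 stands in for the package's +/-/eq comparator)
grid_exp$mpg <- as.numeric(predict(fit, newdata = newdata))
grid_exp$evo <- sign(grid_exp$mpg - obs_pred)
```

The grid grows multiplicatively with the number of columns and values per column, which is the memory risk that the safeguard and safeguard_val arguments are meant to cap.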
## Not run:
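# Load the packages used below
library(data.table) # setDT
library(xgboost)    # xgboost
library(Laurae)     # partial_dep.obs, partial_dep.plot, partial_dep.feature, predictor_xgb
# Let's load a dummy dataset
data(mtcars)
setDT(mtcars) # Transform to data.table for easier manipulation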
# We train an xgboost model on the first 31 observations and keep the last one to analyze later
set.seed(0)
xgboost_model <- xgboost(data = data.matrix(mtcars[-32, -1]),
                         label = mtcars$mpg[-32],
                         nrounds = 20)
# Perform partial dependence grid prediction to analyze the behavior of the 32nd observation
# We want to check how it behaves with:
# => horsepower (hp)
# => number of cylinders (cyl)
# => transmission (am)
# => number of carburetors (carb)
preds_partial <- partial_dep.obs(model = xgboost_model,
                                 predictor = predictor_xgb, # Default for xgboost
                                 data = mtcars[-32, -1], # train data = first 31 observations
                                 observation = mtcars[32, -1], # 32nd observation to analyze
                                 column = c("hp", "cyl", "am", "carb"),
                                 accuracy = 20, # Up to 20 unique values per column
                                 safeguard = TRUE, # Prevent high memory usage
                                 safeguard_val = 1048576, # No more than 1048576 observations
                                 exact_only = TRUE, # Do not allow approximations
                                 label_name = "mpg", # Label is assumed to be "mpg"
                                 comparator_name = "evo") # Comparator +/-/eq for analysis
# How many observations? 300
nrow(preds_partial$grid_exp)
# How many observations analyzed per column? hp=10, cyl=3, am=2, carb=5
summary(preds_partial$grid_init)
# When cyl decreases, mpg increases!
partial_dep.plot(grid_data = preds_partial$grid_exp,
                 backend = "tableplot",
                 label_name = "mpg",
                 comparator_name = "evo")
# Another way of plotting... hp/mpg relationship is not obvious
partial_dep.plot(grid_data = preds_partial$grid_exp,
                 backend = "car",
                 label_name = "mpg",
                 comparator_name = "evo")
# Do NOT do this on >1k samples, this will kill RStudio
# Histograms make it obvious when decrease/increase happens.
partial_dep.plot(grid_data = preds_partial$grid_exp,
                 backend = "plotly",
                 label_name = "mpg",
                 comparator_name = "evo")
# Get statistics to analyze fast
partial_dep.feature(preds_partial$grid_exp, metric = "emp", in_depth = FALSE)
# Get in-depth statistics to analyze; this is very slow on large data
# Note: unreliable for a large number of observations due to asymptotic infinities
partial_dep.feature(preds_partial$grid_exp, metric = "emp", in_depth = TRUE)
## End(Not run)