calculate_partial_dependency: Calculate partial dependency for a list of models


Description

Given a list of models, compute each model's average prediction over a grid of values of a specified feature. For each model and each grid value (cutpoint), the result is the mean of the output from calculate_ice.
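The averaging described above can be sketched by hand in base R. This is only an illustration under stated assumptions, not the package's implementation: `manual_pd` is a hypothetical helper, and it assumes that the ICE step means predicting for every row with the feature fixed at one cutpoint.

```r
# Hypothetical helper (not part of brightbox): partial dependence for one
# model and one numeric feature, using base R only.
manual_pd <- function(model, data, feature_col, num_grid = 10) {
  # Evenly spaced cutpoints across the observed range of the feature
  grid <- seq(min(data[[feature_col]]), max(data[[feature_col]]),
              length.out = num_grid)
  avg_pred <- vapply(grid, function(v) {
    modified <- data
    # Fix the feature at this cutpoint for every row
    modified[[feature_col]] <- rep(v, nrow(modified))
    # Average the per-row predictions at this cutpoint
    mean(predict(model, newdata = modified))
  }, numeric(1))
  data.frame(feature = feature_col, feature_val = grid, prediction = avg_pred)
}

df <- data.frame(a = 1:3, b = 2:4, c = c(8, 11, 14))
m <- lm(c ~ a + b - 1, data = df)
pd <- manual_pd(m, df, "a", num_grid = 6)
```

Each row of `pd` holds one cutpoint of `a` and the mean prediction with `a` held at that value for all observations.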

Usage

calculate_partial_dependency(feature_dt, feature_col, model_list,
  num_grid = 10, custom_range = NULL, predict_fcn = predict,
  ensemble_colname = "ensemble", ensemble_fcn = median,
  ensemble_models = names(model_list))

Arguments

feature_dt

data.table containing features used in predictive model

feature_col

character. name of a column in feature_dt

model_list

named list of model objects. Each name will become a column containing predictions from that model.

num_grid

number of points to distribute along range of feature_col or custom_range

custom_range

Defines a custom range to calculate partial dependency over, in place of the observed range of feature_col. This can be a 2-element numeric vector or a character vector, depending on the type of feature_col.

predict_fcn

function that accepts a model as its first argument and newdata as one of its named arguments

ensemble_colname

character. Name of the column containing ensemble predictions

ensemble_fcn

function that combines a vector of predictions into a single ensemble. Default is median

ensemble_models

character vector of names from model_list. These models will be combined by ensemble_fcn to form the ensemble
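As a rough illustration of the shapes that predict_fcn and ensemble_fcn are described as having, the sketch below defines one of each in base R. Both `predict_response` and `trimmed_mean` are hypothetical names, and the row-by-row combination at the end is an assumption about how an ensemble function would be applied across models, not a statement of the package's internals.

```r
# Hypothetical prediction wrapper: model first, newdata as a named argument.
predict_response <- function(model, newdata, ...) {
  predict(model, newdata = newdata, type = "response", ...)
}

# Hypothetical ensemble function: maps a numeric vector of predictions
# (one per model) to a single value.
trimmed_mean <- function(preds) mean(preds, trim = 0.2)

df <- data.frame(a = 1:3, b = 2:4, c = c(8, 11, 14))
models <- list(lm1 = lm(c ~ a + b - 1, data = df),
               glm1 = glm(c ~ a + b - 1, data = df))

# One column of predictions per model, named after the list entries
preds <- sapply(models, predict_response, newdata = df)

# Combine across models for each row (assumed usage of an ensemble function)
ensemble <- apply(preds, 1, median)
```

Functions of these shapes could then be supplied as `predict_fcn = predict_response` and `ensemble_fcn = trimmed_mean`.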

Value

Output is a data.table with the columns feature, feature_val, model, and prediction

Examples

## Not run: 
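library(data.table)  # provides data.table()
library(brightbox)   # provides calculate_partial_dependency()
dt <- data.table(a = 1:3, b = 2:4, c = c(8, 11, 14))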
m <- lm(c ~ a + b - 1, dt)
gm <- glm(c ~ a + b - 1, data = dt)
calculate_partial_dependency(dt, "a", list(lm1 = m),
                             num_grid = 6)
calculate_partial_dependency(dt, "a", list(lm1 = m, glm1 = gm),
                             num_grid = 6, ensemble_fcn = sum)
calculate_partial_dependency(dt, "a", list(lm1 = m),
                             num_grid = 6, custom_range = c(1,6))

## End(Not run)

breather/brightbox documentation built on May 13, 2019, 5:04 a.m.