single_global_effect: Global Interpretation for a Single Model

single_global_effect R Documentation

Global Interpretation for a Single Model

Description

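Creates accumulated local effects (ALEs), partial dependence plots (PDPs) and individual conditional expectations (ICEs) for one or more variables of a single h2o model.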

Usage

single_global_effect(
  model,
  data,
  vars,
  max_levels = 30,
  method = "ale",
  quantiles = seq(0, 1, 0.1)
)

Arguments

model

an h2o model.

data

a dataset. The dataset used to create the model.

vars

a list of character strings. Each element is a variable for which the global effect is calculated. For the "ale" and "pdp" methods, two-way variable interaction effects can also be calculated. To specify a two-way interaction, enter the pair of variables as a vector in the list. For example, list("v1", c("v2", "v3"), "v4") will calculate the global effects for v1 and v4, and the two-way effect of v2 and v3.
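The structure of vars described above can be sketched in base R as follows. The variable names are hypothetical placeholders, and the classification step is for illustration only; it is not taken from the package's source.

```r
# Hypothetical variable names; replace with columns from your own dataset.
vars <- list("v1", c("v2", "v3"), "v4")

# Each list element is one requested effect: an element of length 1 is a
# single-variable effect, an element of length 2 is a two-way interaction.
effect_type <- ifelse(lengths(vars) == 2, "two-way", "one-way")
labels <- vapply(vars, paste, character(1), collapse = " x ")

print(data.frame(vars = labels, effect_type = effect_type))
```

A plain character vector such as c("v2", "v3") on its own would not distinguish two separate variables from one interaction pair, which is presumably why a list is required.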

max_levels

a numeric. Maximum number of unique levels to calculate pdp for each variable.

method

a character. Takes the value "ale", "pdp" or "ice".

quantiles

a numeric vector of quantiles (numbers from 0 to 1) at which each ICE is calculated. Only used when method = "ice".

Details

The "ale" and "pdp" methods are implemented using the iml package. The main advantage of the iml package is that it is robust and has one of the fastest algorithms for computing global effects. Further, it is one of the few packages which is able to calculate two-way variable interactions for ALEs and PDPs. For more details, see iml::FeatureEffect.
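As a rough illustration of the iml interface referred to above, the sketch below computes a one-way ALE and a two-way PDP with iml::FeatureEffect. It is not the code used inside single_global_effect. As an assumption made to keep the example self-contained, a plain lm fit on the built-in mtcars data stands in for an h2o model; an h2o model would additionally need a prediction wrapper, which is omitted here.

```r
library(iml)

# Assumption: an lm fit on mtcars stands in for an h2o model.
fit <- lm(mpg ~ wt + hp, data = mtcars)
features <- mtcars[, c("wt", "hp")]

# iml wraps the model and its data in a Predictor object.
predictor <- Predictor$new(fit, data = features, y = mtcars$mpg)

# One-way accumulated local effect for a single variable.
ale_wt <- FeatureEffect$new(predictor, feature = "wt", method = "ale")

# Two-way partial dependence for a pair of variables.
pdp_two_way <- FeatureEffect$new(predictor, feature = c("wt", "hp"), method = "pdp")

# The computed effects are stored as data frames in the results field.
head(ale_wt$results)
head(pdp_two_way$results)
```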

The "ice" method is implemented using h2o's built-in h2o.ice function. This is preferred over iml's implementation because h2o.ice allows flexibility in choosing which rows the ICE is calculated for. In the iml package, the ICE is calculated for all rows, which can become computationally intensive for large datasets. For more details, see h2o::h2o.ice.

When modelling with the nano package, it is recommended to use the nano_global_effect function instead. It is a wrapper for a series of functions which calculate global effects. It is able to calculate the global effects directly from a nano object, for both single and multiple models, and has the option to return various plots.

Value

a data.table containing the values for each variable, combined together.

Examples

## Not run: 
if(interactive()){
 library(h2o)
 library(nano)
 
 h2o.init()
 
 # import dataset
 data(property_prices)
 train <- as.h2o(property_prices)
 
 # set the response and predictors
 response <- "sale_price"
 var <- setdiff(colnames(property_prices), response)
 
 # build model
 grid <- h2o.grid(x               = var,
                  y               = response,
                  training_frame  = train,
                  algorithm       = "randomForest",
                  hyper_params    = list(ntrees = 1:2),
                  nfolds          = 3,
                  seed            = 628)
 model <- h2o.getModel(grid@model_ids[[1]])
 
 # calculate the global effect of lot_size (method defaults to "ale")
 single_global_effect(model, property_prices, c("lot_size"))
 
 }

## End(Not run)

Nanoputian628/nano documentation built on Oct. 30, 2023, 3:28 p.m.