nano_ice: Create ICE

nano_ice R Documentation

Create ICE

Description

Creates partial dependency plots (PDPs) from h2o models stored in nano objects.

Usage

nano_ice(
  nano,
  model_no = NA,
  vars,
  quantiles = seq(0, 1, 0.1),
  max_levels = 30,
  target = NULL,
  plot = TRUE,
  subtitle = NA,
  save = FALSE,
  subdir = NA,
  file_name
)

Arguments

model_no

the positions, in the nano object's list of models, of the models for which the PDP should be calculated. If not entered, the last model is taken by default.

vars

a character vector of variables for which to create PDPs.

plot

a logical specifying whether the variable importance should be plotted.

subtitle

subtitle for the plot.

save

a logical specifying whether the plot should be saved to the working directory.

subdir

subdirectory in which the plot should be saved.

file_name

file name of the saved plot.

row_index

a numeric vector of dataset row numbers to be used to calculate PDPs. To use the entire dataset, set to -1.

Details

The function first checks whether the variable importance of the specified model has already been calculated (by checking the list nano$varimp). If it has not, the variable importance is calculated and the relevant slot in nano$varimp is filled.

If plot = TRUE, a plot of the variable importance will also be returned. The plot can be saved in a subfolder of the working directory by using the save and subdir arguments.
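
For orientation, an ICE (individual conditional expectation) curve traces one observation's prediction as a single variable is varied over a grid, and a PDP is the average of those curves. The base-R sketch below illustrates only that general idea, using lm on the built-in mtcars data. It is not the implementation of nano_ice, the helper ice_curves is hypothetical, and the decile grid is an assumption chosen for illustration rather than a statement about the quantiles argument.

```r
# Illustrative only: ICE curves for a linear model on mtcars (base R).
# `ice_curves` is a hypothetical helper, not part of nano or h2o.
ice_curves <- function(model, data, var, grid) {
  # result: one row per observation, one column per grid value
  sapply(grid, function(value) {
    modified <- data
    modified[[var]] <- value  # set `var` to the grid value for every row
    as.numeric(predict(model, newdata = modified))
  })
}

fit  <- lm(mpg ~ wt + hp, data = mtcars)
grid <- as.numeric(quantile(mtcars$wt, probs = seq(0, 1, 0.1)))

ice <- ice_curves(fit, mtcars, "wt", grid)  # matrix: observations x grid values
pdp <- colMeans(ice)                        # the PDP is the average ICE curve

# one grey line per observation, with the PDP overlaid in red
matplot(grid, t(ice), type = "l", lty = 1, col = "grey",
        xlab = "wt", ylab = "predicted mpg")
lines(grid, pdp, col = "red", lwd = 2)
```

Because the model here is additive, every ICE curve is parallel to the PDP; with a model that captures interactions, the individual curves can diverge from the average, which is what an ICE plot is meant to reveal.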

Value

nano object with variable importance of specified models calculated. Also returns a plot if plot = TRUE.

Examples

## Not run: 
if(interactive()){
 library(h2o)
 library(nano)
 
 h2o.init()
 
 # import dataset
 data(property_prices)
 train <- as.h2o(property_prices)
 
 # set the response and predictors
 response <- "sale_price"
 var <- setdiff(colnames(property_prices), response)
 
 # build grids
 grid_1 <- h2o.grid(x               = var,
                    y               = response,
                    training_frame  = train,
                    algorithm       = "randomForest",
                    hyper_params    = list(ntrees = 1:2),
                    nfolds          = 3,
                    seed            = 628)

 grid_2 <- h2o.grid(x               = var,
                    y               = response,
                    training_frame  = train,
                    algorithm       = "randomForest",
                    hyper_params    = list(ntrees = 3:4),
                    nfolds          = 3,
                    seed            = 628)

 
 # the underlying dataset is the same for both grids;
 # since model is not entered, will take best model from grids
 obj <- create_nano(grid = list(grid_1, grid_2),
                    data = list(property_prices))
 
 # calculate ICE
 obj <- nano_ice(nano = obj, model_no = 1:2, vars = c("lot_size", "income"))
 
 }

## End(Not run)

Nanoputian628/nano documentation built on Oct. 30, 2023, 3:28 p.m.