nano_pdp | R Documentation |
Creates partial dependence plots (PDPs) from h2o models stored in nano objects.
nano_pdp(
nano,
model_no = NA,
vars,
row_index = -1,
plot = TRUE,
save = FALSE,
subdir = NA,
file_type = "html"
)
model_no |
the positions, in the nano object's list of models, of the models for which PDPs should be calculated. If not specified, the last model is used by default. |
vars |
a character vector of variables to create PDPs for. |
row_index |
a numeric vector of dataset row numbers to be used to calculate PDPs. To use the entire dataset, set to -1. |
plot |
a logical specifying whether the PDPs should be plotted. |
save |
a logical specifying whether the plot should be saved to the working directory. |
subdir |
subdirectory of the working directory in which the plot should be saved. |
file_type |
file type in which the plots should be saved. Can take values |
The function first checks whether the PDPs of the specified models have already been calculated (by checking the list nano$pdp). If they have not, the required PDPs are calculated and the relevant slots in nano$pdp are filled.
If plot = TRUE, a plot of the PDPs is also returned. The plot can be saved in a subfolder of the working directory by using the save and subdir arguments.
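The check-then-fill behaviour described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the plain list standing in for a nano object, the helper fill_pdp_slot, and the placeholder fake_compute are all hypothetical names introduced here, and the sketch assumes an uncalculated slot is marked with NA.

```r
# Hypothetical stand-in for a nano object: a list with one `pdp` slot per model.
# Assumption: an uncalculated slot holds NA.
obj <- list(pdp = list(NA, NA))

# Hypothetical helper: fill a slot only if it has not been calculated yet.
fill_pdp_slot <- function(obj, model_no, compute) {
  for (i in model_no) {
    if (identical(obj$pdp[[i]], NA)) {
      obj$pdp[[i]] <- compute(i)  # slot is empty, so calculate and store
    }
    # otherwise the stored result is reused as-is
  }
  obj
}

# Placeholder for the real PDP calculation; counts how often it is called.
calls <- 0
fake_compute <- function(i) {
  calls <<- calls + 1
  data.frame(model = i, var = "lot_size", mean_response = 0)
}

obj <- fill_pdp_slot(obj, 1:2, fake_compute)  # both slots are calculated
obj <- fill_pdp_slot(obj, 1:2, fake_compute)  # both slots are reused
calls  # number of times the placeholder calculation actually ran
```

Under these assumptions, calling the helper a second time on the same models does no further work, which is the point of checking nano$pdp before recalculating.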
A nano object with the PDPs of the specified models calculated. Also returns a plot if plot = TRUE and saves each of the plots if save = TRUE.
## Not run:
if(interactive()){
library(h2o)
library(nano)
h2o.init()
# import dataset
data(property_prices)
train <- as.h2o(property_prices)
# set the response and predictors
response <- "sale_price"
var <- setdiff(colnames(property_prices), response)
# build grids
grid_1 <- h2o.grid(x = var,
y = response,
training_frame = train,
algorithm = "randomForest",
hyper_params = list(ntrees = 1:2),
nfolds = 3,
seed = 628)
grid_2 <- h2o.grid(x = var,
y = response,
training_frame = train,
algorithm = "randomForest",
hyper_params = list(ntrees = 3:4),
nfolds = 3,
seed = 628)
# the underlying dataset is the same for both grids, so it is supplied once
# since no model is entered, the best model from each grid is taken
obj <- create_nano(grid = list(grid_1, grid_2),
data = list(property_prices))
# calculate PDP and save plots in working directory
obj <- nano_pdp(nano = obj, model_no = 1:2, vars = c("lot_size", "income"),
plot = TRUE, save = TRUE)
}
## End(Not run)