bootImportance: Bootstrap Variable Importance And Averaged Grid Variable...

View source: R/bootImportance.R

bootImportance    R Documentation

Bootstrap Variable Importance And Averaged Grid Variable Importance

Description

Evaluates variable importance, as well as bootstrapped variable importance, for a single model or a grid of models.
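
To make the idea of bootstrapped variable importance concrete, here is a minimal base-R sketch. It is an illustration under stated assumptions, not the h2otools implementation: the model is a plain lm fit on the built-in mtcars data, importance is measured by a stand-in technique (permutation importance, the increase in RMSE when one predictor is shuffled), and the helper names rmse and perm_importance are hypothetical.

```r
# Sketch only: bootstrapped permutation importance with a stand-in lm model.
# Assumption: permutation importance is used for illustration; the source
# does not state how bootImportance computes importance.
set.seed(42)

predictors <- c("wt", "hp", "qsec")
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Hypothetical helper: root mean squared error
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Hypothetical helper: increase in RMSE when each predictor is shuffled
perm_importance <- function(data) {
  base_error <- rmse(data$mpg, predict(fit, newdata = data))
  vapply(predictors, function(v) {
    shuffled <- data
    shuffled[[v]] <- sample(shuffled[[v]])  # break the link to the outcome
    rmse(shuffled$mpg, predict(fit, newdata = shuffled)) - base_error
  }, numeric(1))
}

n <- 50  # number of bootstrap samples
boot_imp <- replicate(n, {
  idx <- sample(nrow(mtcars), replace = TRUE)  # rows drawn with replacement
  perm_importance(mtcars[idx, ])
})

# boot_imp has one row per predictor and one column per bootstrap sample
rowMeans(boot_imp)
```

Averaging across the columns gives one importance score per predictor, and the spread within each row indicates how stable that score is across resamples.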

Usage

bootImportance(model, df, metric, n = 100)

Arguments

model

a model or a grid of models trained with the h2o machine learning software

df

Dataset for testing the model. If "n" is greater than 1 (as with the default shown in Usage, n = 100), bootstrap samples are drawn from this dataset. Otherwise, the entire dataset is used to evaluate the model.

metric

Character. Model evaluation metric to be passed to the boot R package, for example "AUC", "AUCPR", or "RMSE", depending on the model you have trained. Any evaluation metric that H2O provides for your model can be specified here.

n

number of bootstraps
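
The role of "df", "metric" and "n" can be sketched in base R. This is a hedged illustration, not the package's code: it uses a stand-in lm model on the built-in mtcars data and resamples with sample() rather than the boot package, and the helper name rmse is hypothetical.

```r
# Sketch only: bootstrapping an evaluation metric over resampled rows.
# Assumption: a stand-in lm model replaces the H2O model, and base-R
# sample() replaces the boot package mentioned under "metric".
set.seed(1)

# Hypothetical helper: root mean squared error
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

fit <- lm(mpg ~ wt + hp, data = mtcars)

n <- 100  # number of bootstrap samples
boot_rmse <- vapply(seq_len(n), function(i) {
  # draw row indices with replacement, same size as the dataset
  idx <- sample(nrow(mtcars), replace = TRUE)
  resampled <- mtcars[idx, ]
  rmse(resampled$mpg, predict(fit, newdata = resampled))
}, numeric(1))

# summarise the bootstrap distribution of the metric
mean(boot_rmse)
quantile(boot_rmse, c(0.025, 0.975))
```

With a single evaluation on the full dataset there is only one value of the metric; repeating the evaluation over resampled rows gives a distribution from which a mean and an interval can be read.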

Value

list containing the mean performance for the specified metric and other bootstrap results

Author(s)

E. F. Haghish

Examples


## Not run: 
library(h2o)
library(h2otools)
h2o.init(ignore_config = TRUE, nthreads = 2, bind_to_localhost = FALSE, insecure = TRUE)
prostate_path <- system.file("extdata", "prostate.csv", package = "h2o")
df <- read.csv(prostate_path)

# prepare the dataset for analysis before converting it to h2o frame.
df$CAPSULE <- as.factor(df$CAPSULE)

# convert the dataframe to H2OFrame and run the analysis
prostate.hex <- as.h2o(df)
aml <- h2o.automl(y = "CAPSULE", training_frame = prostate.hex, max_runtime_secs = 30)

# evaluate the model performance
perf <- h2o.performance(aml@leader, xval = TRUE)

# evaluate bootstrapped variable importance on the training dataset
#    NOTE that the raw data is given, not the 'H2OFrame'
imp <- bootImportance(model = aml@leader, df = df, metric = "RMSE", n = 500)

## End(Not run)

h2otools documentation built on April 4, 2025, 2:33 a.m.