h2o.permutation_importance: Calculate Permutation Feature Importance.
In h2o: R Interface for the 'H2O' Scalable Machine Learning Platform

h2o.permutation_importance

R Documentation

Calculate Permutation Feature Importance.

Description

Calculates permutation feature importance for a trained model. When n_repeats == 1, the result is similar to that of h2o.varimp(): it contains the columns "Relative Importance", "Scaled Importance", and "Percentage".

Usage

h2o.permutation_importance(
  object,
  newdata,
  metric = c("AUTO", "AUC", "MAE", "MSE", "RMSE", "logloss", "mean_per_class_error",
    "PR_AUC"),
  n_samples = 10000,
  n_repeats = 1,
  features = NULL,
  seed = -1
)

Arguments

`object`	A trained supervised H2O model.
`newdata`	Training frame of the model; its feature columns are permuted to measure importance.
`metric`	Metric to be used. One of "AUTO", "AUC", "MAE", "MSE", "RMSE", "logloss", "mean_per_class_error", "PR_AUC". Defaults to "AUTO".
`n_samples`	Number of samples to be evaluated. Use -1 to use the whole dataset. Defaults to 10000.
`n_repeats`	Number of repeated evaluations. Defaults to 1.
`features`	Character vector of features to include in the permutation importance. Use NULL to include all.
`seed`	Seed for the random generator. Use -1 to pick a random seed. Defaults to -1.

Details

When n_repeats > 1, each column holds the permutation variable importance values from one run. These values correspond to the "Relative Importance", i.e., the difference between the original prediction error and the prediction error obtained on a frame in which the given feature has been permuted.
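The idea described above can be illustrated without H2O. The following is a minimal base R sketch, not H2O's implementation: it fits a linear model on the built-in mtcars data, shuffles one feature at a time, and records how much the mean squared error grows. The helper mse(), the choice of model and features, and the column names of the result are assumptions made for illustration. The scaled and percentage columns assume the usual variable importance convention (divide by the maximum and by the sum, respectively).

```r
# Conceptual sketch only; this is not how H2O computes the values internally.
set.seed(42)  # fixed seed so the shuffles are reproducible

fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Hypothetical helper: mean squared error of a model on a data frame.
mse <- function(model, data) {
  mean((data$mpg - predict(model, newdata = data))^2)
}

baseline <- mse(fit, mtcars)
features <- c("wt", "hp", "qsec")

# For each feature, shuffle its column and measure the increase in error.
relative <- sapply(features, function(f) {
  permuted <- mtcars
  permuted[[f]] <- sample(permuted[[f]])  # break the link between f and the response
  mse(fit, permuted) - baseline
})

# Assumed convention: scale by the maximum, percentage as share of the sum.
importance <- data.frame(
  variable            = features,
  relative_importance = unname(relative),
  scaled_importance   = unname(relative / max(relative)),
  percentage          = unname(relative / sum(relative))
)

importance[order(-importance$relative_importance), ]
```

Repeating the shuffle several times and keeping one column per run would mirror what n_repeats > 1 reports.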

Value

H2OTable with variable importance.

Examples

## Not run: 
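library(h2o)
h2o.init()
# Load the prostate dataset shipped with the h2o package
prostate_path <- system.file("extdata", "prostate.csv", package = "h2o")
prostate <- h2o.importFile(prostate_path)
# Convert the response (column 2) to a factor for binary classification
prostate[, 2] <- as.factor(prostate[, 2])
# Train a GBM on columns 3 to 9 as predictors
model <- h2o.gbm(x = 3:9, y = 2, training_frame = prostate, distribution = "bernoulli")
# Permutation importance with default settings (n_repeats = 1)
h2o.permutation_importance(model, prostate)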

## End(Not run)
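The remaining arguments can be combined as in the sketch below, which has not been run here. It assumes a local H2O cluster can be started and that the bundled prostate.csv contains columns named AGE, PSA, and GLEASON. The metric, the feature subset, and the seed values are arbitrary choices for illustration.

```r
library(h2o)
h2o.init()

prostate_path <- system.file("extdata", "prostate.csv", package = "h2o")
prostate <- h2o.importFile(prostate_path)
prostate[, 2] <- as.factor(prostate[, 2])

model <- h2o.gbm(x = 3:9, y = 2, training_frame = prostate,
                 distribution = "bernoulli", seed = 1)

# Five repeated runs on the whole frame, restricted to three features
# (column names assumed), scored with AUC and a fixed seed.
pvi <- h2o.permutation_importance(
  model,
  prostate,
  metric    = "AUC",
  n_samples = -1,
  n_repeats = 5,
  features  = c("AGE", "PSA", "GLEASON"),
  seed      = 1234
)
print(pvi)
```

Fixing seed makes the permutations repeatable across calls, which helps when comparing models.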

h2o documentation built on May 29, 2024, 4:26 a.m.