#| child: aaa.Rmd
#| include: false

r descr_models("rand_forest", "h2o")

Tuning Parameters

#| label: h2o-param-info
#| echo: false
defaults <- 
  tibble::tibble(parsnip = c("mtry", "trees", "min_n"),
                 default = c("see below", "50L", 1))

param <-
  rand_forest() |> 
  set_engine("h2o") |> 
  make_parameter_list(defaults)

This model has r nrow(param) tuning parameters:

#| label: h2o-param-list
#| echo: false
#| results: asis
param$item

mtry depends on the number of columns and the model mode. The default in [h2o::h2o.randomForest()] is floor(sqrt(ncol(x))) for classification and floor(ncol(x)/3) for regression.

Translation from parsnip to the original package (regression)

[agua::h2o_train_rf()] is a wrapper around [h2o::h2o.randomForest()].

#| label: h2o-reg
rand_forest(
  mtry = integer(1),
  trees = integer(1),
  min_n = integer(1)
) |>  
  set_engine("h2o") |> 
  set_mode("regression") |> 
  translate()

min_rows() and min_cols() will adjust the number of neighbors if the chosen value if it is not consistent with the actual data dimensions.

Translation from parsnip to the original package (classification)

#| label: h2o-cls
rand_forest(
  mtry = integer(1),
  trees = integer(1),
  min_n = integer(1)
) |> 
  set_engine("h2o") |> 
  set_mode("classification") |> 
  translate()

Preprocessing requirements

#| child: template-tree-split-factors.Rmd

Initializing h2o

#| child: template-h2o-init.Rmd

Saving fitted model objects

#| child: template-bundle.Rmd


Try the parsnip package in your browser

Any scripts or data that you put into this service are public.

parsnip documentation built on June 8, 2025, 12:10 p.m.