#| child: aaa.Rmd
#| include: false

r descr_models("boost_tree", "h2o")

Tuning Parameters

#| label: h2o-param-info
#| echo: false
defaults <- 
  tibble::tibble(parsnip = c("mtry", "trees", "tree_depth", "learn_rate", "sample_size", "min_n",  "loss_reduction", "stop_iter"),
                 default = c(1, 50L, 6, 0.3, 1, 1, 0, 0))

# For this model, this is the same for all modes
param <-
 boost_tree() |> 
  set_engine("h2o") |> 
  set_mode("regression") |> 
  make_parameter_list(defaults)

This model has r nrow(param) tuning parameters:

#| label: h2o-param-list
#| echo: false
#| results: asis
param$item

min_n represents the fewest allowed observations in a terminal node, [h2o::h2o.xgboost()] allows only one row in a leaf by default.

stop_iter controls early stopping rounds based on the convergence of the engine parameter stopping_metric. By default, [h2o::h2o.xgboost()] does not use early stopping. When stop_iter is not 0, [h2o::h2o.xgboost()] uses logloss for classification, deviance for regression and anonomaly score for Isolation Forest. This is mostly useful when used alongside the engine parameter validation, which is the proportion of train-validation split, parsnip will split and pass the two data frames to h2o. Then [h2o::h2o.xgboost()] will evaluate the metric and early stopping criteria on the validation set.

Translation from parsnip to the original package (regression)

[agua::h2o_train_xgboost()] is a wrapper around [h2o::h2o.xgboost()].

r uses_extension("boost_tree", "h2o", "regression")

#| label: h2o-reg
boost_tree(
  mtry = integer(), trees = integer(), tree_depth = integer(), 
  learn_rate = numeric(), min_n = integer(), loss_reduction = numeric(), stop_iter = integer()
) |>
  set_engine("h2o") |>
  set_mode("regression") |>
  translate()

Translation from parsnip to the original package (classification)

r uses_extension("boost_tree", "h2o", "classification")

#| label: h2o-cls
boost_tree(
  mtry = integer(), trees = integer(), tree_depth = integer(), 
  learn_rate = numeric(), min_n = integer(), loss_reduction = numeric(), stop_iter = integer()
) |> 
  set_engine("h2o") |> 
  set_mode("classification") |> 
  translate()

Preprocessing

#| child: template-tree-split-factors.Rmd

Non-numeric predictors (i.e., factors) are internally converted to numeric. In the classification context, non-numeric outcomes (i.e., factors) are also internally converted to numeric.

Interpreting mtry

#| child: template-mtry-prop.Rmd

Initializing h2o

#| child: template-h2o-init.Rmd

Saving fitted model objects

#| child: template-bundle.Rmd


Try the parsnip package in your browser

Any scripts or data that you put into this service are public.

parsnip documentation built on June 8, 2025, 12:10 p.m.