#| child: aaa.Rmd
#| include: false

r descr_models("rand_forest", "grf")

Tuning Parameters

#| label: grf-param-info
#| echo: false
defaults <- 
  tibble::tibble(parsnip = c("mtry", "trees", "min_n"),
                 default = c("see below", "2000L", "5L"))

param <-
  rand_forest() |> 
  set_engine("grf") |> 
  make_parameter_list(defaults)

This model has r nrow(param) tuning parameters:

#| label: grf-param-list
#| echo: false
#| results: asis
param$item

mtry depends on the number of columns. If there are p predictors, the default value of mtry is min(ceiling(sqrt(p) + 20), p).
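
As a quick illustration of the formula above, the following base R sketch evaluates the default for a few predictor counts. The helper name default_mtry is hypothetical and is not part of parsnip or grf; it only restates the expression given in the text.

```r
# Hypothetical helper (not exported by parsnip or grf): restates the
# default mtry rule quoted above for p predictors.
default_mtry <- function(p) {
  min(ceiling(sqrt(p) + 20), p)
}

# With few predictors, the cap at p applies; with many, sqrt(p) + 20 does.
small_p <- default_mtry(10)
mid_p   <- default_mtry(400)
large_p <- default_mtry(1000)
```

For small data sets the rule therefore uses every predictor at each split, and the square-root term only matters once p exceeds the value of ceiling(sqrt(p) + 20).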

Translation from parsnip to the original package (regression)

See ?regression_forest

#| label: grf-reg
rand_forest(
  mtry = integer(1),
  trees = integer(1),
  min_n = integer(1)
) |>  
  set_engine("grf") |> 
  set_mode("regression") |> 
  translate()

Translation from parsnip to the original package (classification)

See ?probability_forest

#| label: grf-cls
rand_forest(
  mtry = integer(1),
  trees = integer(1),
  min_n = integer(1)
) |> 
  set_engine("grf") |> 
  set_mode("classification") |> 
  translate()

Translation from parsnip to the original package (quantile regression)

See ?quantile_forest

Any quantile regression model requires the quantile levels to be specified up front, when the mode is set.

#| label: grf-quant
rand_forest(
  mtry = integer(1),
  trees = integer(1),
  min_n = integer(1)
) |> 
  set_engine("grf") |> 
  set_mode("quantile regression", quantile_levels = (1:3) / 4) |> 
  translate()

Preprocessing requirements

This method requires qualitative predictors to be converted to a numeric format. The grf package does not do this itself, so the conversion would otherwise be manual. When using parsnip, a one-hot encoding is applied automatically.
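
To show what a one-hot encoding produces, here is a minimal base R sketch. It is not the code that parsnip runs internally; the data frame, its column names, and the use of model.matrix() are assumptions made purely for illustration.

```r
# Illustrative data (hypothetical): one qualitative and one numeric predictor
dat <- data.frame(
  color = factor(c("red", "green", "blue", "green")),
  size  = c(1.5, 2, 0.5, 3)
)

# Dropping the intercept (the `0 +` term) makes model.matrix() emit a full
# set of indicator columns for the factor, one column per level.
x_numeric <- model.matrix(~ 0 + color + size, data = dat)

colnames(x_numeric)
```

Each row has exactly one indicator column equal to 1, so no factor level is treated as a reference level.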

If there are missing values in the predictors, the model will use case-wise deletion to remove the affected rows.

Other notes

By default, parallel processing is turned off. When tuning, it is more efficient to parallelize over the resamples and tuning parameters. To parallelize the construction of the trees within the grf model, change the num.threads argument via set_engine().
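
A minimal sketch of passing the engine argument is shown below. The thread count of 4 and the value of trees are arbitrary illustrative choices, not recommendations, and the object name grf_spec is hypothetical.

```r
library(parsnip)

# Assumption: 4 threads is an arbitrary illustrative value. Pick a value
# that suits the machine and does not compete with resample-level
# parallelism when tuning.
grf_spec <-
  rand_forest(trees = 1000) |>
  set_engine("grf", num.threads = 4) |>
  set_mode("regression")

# Engine-specific arguments are stored on the model specification
names(grf_spec$eng_args)
```

Because num.threads is an engine argument rather than a main parsnip argument, it is given to set_engine() and not to rand_forest().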

For grf confidence intervals, the intervals are constructed using the form estimate +/- z * std_error. For classification probabilities, the interval bounds can fall outside of [0, 1]; they are coerced to lie within this range.
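
The interval form above can be sketched in base R as follows. This is an illustration under stated assumptions and not parsnip's implementation: the function name wald_interval, the level and probability arguments, and the reading of "coerced" as clamping to the nearest bound are all assumptions.

```r
# Hypothetical helper: estimate +/- z * std_error, optionally limited to
# [0, 1] for class probabilities. Clamping is an assumed reading of how
# out-of-range bounds are coerced.
wald_interval <- function(estimate, std_error, level = 0.95, probability = FALSE) {
  # Two-sided standard normal quantile for the requested level
  z <- qnorm(1 - (1 - level) / 2)
  lower <- estimate - z * std_error
  upper <- estimate + z * std_error
  if (probability) {
    lower <- pmin(pmax(lower, 0), 1)
    upper <- pmin(pmax(upper, 0), 1)
  }
  data.frame(lower = lower, upper = upper)
}

# A probability estimate near 1 whose raw upper bound exceeds 1
raw     <- wald_interval(0.95, 0.05)
clamped <- wald_interval(0.95, 0.05, probability = TRUE)
```

In this example the raw upper bound is above 1, while the clamped version stops at 1 and leaves the lower bound unchanged.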

Case weights

The regression and classification models enable the use of case weights. The quantile regression mode does not.

Examples

The "Fitting and Predicting with parsnip" article contains examples for rand_forest() with the "grf" engine.

References

Athey, Susan, Julie Tibshirani, and Stefan Wager. "Generalized Random Forests". Annals of Statistics, 47(2), 2019.




parsnip documentation built on Jan. 11, 2026, 9:06 a.m.