man/rmd/decision_tree_spark.md

This engine supports two modes: classification and regression.

Tuning Parameters

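This model has 2 tuning parameters: tree_depth and min_n. In the fit templates below, tree_depth is passed to sparklyr as max_depth and min_n as min_instances_per_node.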
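As a minimal sketch of how the two parameters might be supplied, the specification below sets them once to fixed values and once to tune() placeholders. The values 4 and 10 are arbitrary illustrations, not recommended settings, and the object names are hypothetical.

```r
library(parsnip)

# Fixed values for both tuning parameters (4 and 10 are arbitrary examples)
fixed_spec <- decision_tree(tree_depth = 4, min_n = 10) %>%
  set_engine("spark") %>%
  set_mode("classification")

# tune() placeholders mark each parameter as one to be optimized later
tuned_spec <- decision_tree(tree_depth = tune(), min_n = tune()) %>%
  set_engine("spark") %>%
  set_mode("classification")
```

Building a specification does not contact Spark; no connection is needed until the model is fit.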

Translation from parsnip to the original package (classification)

decision_tree(tree_depth = integer(1), min_n = integer(1)) %>% 
  set_engine("spark") %>% 
  set_mode("classification") %>% 
  translate()
## Decision Tree Model Specification (classification)
## 
## Main Arguments:
##   tree_depth = integer(1)
##   min_n = integer(1)
## 
## Computational engine: spark 
## 
## Model fit template:
## sparklyr::ml_decision_tree_classifier(x = missing_arg(), formula = missing_arg(), 
##     max_depth = integer(1), min_instances_per_node = min_rows(0L, 
##         x), seed = sample.int(10^5, 1))

Translation from parsnip to the original package (regression)

decision_tree(tree_depth = integer(1), min_n = integer(1)) %>% 
  set_engine("spark") %>% 
  set_mode("regression") %>% 
  translate()
## Decision Tree Model Specification (regression)
## 
## Main Arguments:
##   tree_depth = integer(1)
##   min_n = integer(1)
## 
## Computational engine: spark 
## 
## Model fit template:
## sparklyr::ml_decision_tree_regressor(x = missing_arg(), formula = missing_arg(), 
##     max_depth = integer(1), min_instances_per_node = min_rows(0L, 
##         x), seed = sample.int(10^5, 1))
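The translated specification can also be inspected programmatically rather than read from the printed template. The sketch below assumes the translated object stores the target function and its arguments under method$fit, which is how the fit template above appears to be assembled; the values 4 and 10 are arbitrary.

```r
library(parsnip)

spec <- decision_tree(tree_depth = 4, min_n = 10) %>%
  set_engine("spark") %>%
  set_mode("regression")

translated <- translate(spec)

# Package and function that the fit template will call
translated$method$fit$func

# Names of the arguments that appear in the fit template
names(translated$method$fit$args)
```

This is a quick way to confirm that tree_depth and min_n were mapped to max_depth and min_instances_per_node before fitting anything.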

Preprocessing requirements

This engine does not require any special encoding of the predictors. Categorical predictors can be partitioned into groups of factor levels (e.g. {a, c} vs {b, d}) when splitting at a node. Dummy variables are not required for this model.
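To illustrate, the sketch below passes a categorical column straight into the model formula with no hand-made dummy variables. The toy data, the table name, and the helper fit_toy_tree() are hypothetical, and sc is assumed to be an open connection created with sparklyr::spark_connect(); the helper is defined but not called here because it needs a running Spark instance.

```r
library(parsnip)

spec <- decision_tree() %>%
  set_engine("spark") %>%
  set_mode("regression")

# Hypothetical toy data: `grp` is categorical and is left as a plain column
toy <- data.frame(
  y   = c(1.2, 3.4, 1.1, 3.9, 1.4, 3.6, 0.9, 4.1),
  grp = rep(c("a", "b", "c", "d"), times = 2),
  x   = c(0.5, 1.5, 0.7, 1.9, 0.4, 1.6, 0.3, 2.0)
)

# `sc` is assumed to be an open connection from sparklyr::spark_connect()
fit_toy_tree <- function(sc) {
  # Upload the local data frame as a Spark table
  toy_tbl <- sparklyr::copy_to(sc, toy, "toy_tbl", overwrite = TRUE)
  # `grp` is used directly in the formula; no dummy columns are built by hand
  fit(spec, y ~ grp + x, data = toy_tbl)
}
```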

Case weights

This model can utilize case weights during model fitting. To use them, see the documentation in case_weights and the examples on tidymodels.org.

The fit() and fit_xy() functions have an argument called case_weights that expects a vector of case weights.

Note that, for spark engines, the case_weights argument value should be a character string that names the column containing the numeric case weights.

Other details

For models created using the "spark" engine, there are several things to consider.


tidymodels/parsnip documentation built on Feb. 19, 2025, 2:10 a.m.