View source: R/discretize_cart.R

step_discretize_cart | R Documentation |

`step_discretize_cart()` creates a *specification* of a recipe step that will
discretize numeric data (e.g. integers or doubles) into bins in a
supervised way using a CART model.

step_discretize_cart(
  recipe,
  ...,
  role = NA,
  trained = FALSE,
  outcome = NULL,
  cost_complexity = 0.01,
  tree_depth = 10,
  min_n = 20,
  rules = NULL,
  skip = FALSE,
  id = rand_id("discretize_cart")
)

`recipe` |
A recipe object. The step will be added to the sequence of operations for this recipe. |

`...` |
One or more selector functions to choose which variables are
affected by the step. See |

`role` |
Defaults to |

`trained` |
A logical to indicate if the quantities for preprocessing have been estimated. |

`outcome` |
A call to |

`cost_complexity` |
The regularization parameter. Any split that does not
decrease the overall lack of fit by a factor of |

`tree_depth` |
The |

`min_n` |
The number of data points in a node required to continue
splitting. Corresponds to |

`rules` |
The splitting rules of the best CART tree to retain for each variable. If length zero, splitting could not be used on that column. |

`skip` |
A logical. Should the step be skipped when the
recipe is baked by |

`id` |
A character string that is unique to this step to identify it. |

`step_discretize_cart()` creates non-uniform bins from numerical
variables by utilizing the information about the outcome variable and
applying a CART model.

The best set of buckets for each variable is chosen using the standard cost-complexity pruning of CART, which makes this discretization method resistant to overfitting.

This step requires the rpart package. If rpart is not installed, the step will stop with a note about installing the package.

Note that the original data will be replaced with the new bins.
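
The per-variable binning described above can be sketched directly with rpart. This is a minimal illustration under stated assumptions, not the step's actual implementation: the helper `cart_breaks()` is hypothetical, and the mapping of `cost_complexity`, `tree_depth`, and `min_n` onto rpart's `cp`, `maxdepth`, and `minsplit` controls is assumed here rather than taken from this page. It fits a one-predictor tree against the outcome, collects the split points, and uses them as bin boundaries.

```r
library(rpart)

# Hypothetical helper (not part of embed): fit a single-predictor CART model
# and return the sorted split points it found.
cart_breaks <- function(x, y, cost_complexity = 0.01, tree_depth = 10, min_n = 20) {
  fit <- rpart(
    y ~ x,
    data = data.frame(x = x, y = y),
    control = rpart.control(
      cp = cost_complexity,  # assumed mapping to rpart's complexity parameter
      maxdepth = tree_depth, # assumed mapping to rpart's maximum depth
      minsplit = min_n,      # assumed mapping to rpart's minimum node size
      maxcompete = 0,        # keep only primary splits in fit$splits
      maxsurrogate = 0
    )
  )
  # A tree with no splits has no `splits` component: nothing to bin on.
  if (is.null(fit$splits)) {
    return(numeric(0))
  }
  sort(unique(unname(fit$splits[, "index"])))
}

# Bin one numeric column using the outcome to choose the boundaries.
breaks <- cart_breaks(iris$Petal.Length, iris$Species)
bins <- cut(iris$Petal.Length, breaks = c(-Inf, breaks, Inf))

# Cross-tabulate the supervised bins against the outcome.
table(bins, iris$Species)
```

A zero-length result from the helper plays the same role as a length-zero entry in `rules`: no useful split was found for that column.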

An updated version of `recipe` with the new step added to the
sequence of any existing operations.

When you `tidy()` this step, a tibble with columns `terms` (the columns that are selected) and `values` is returned.

This step performs a supervised operation that can utilize case weights.
To use them, see the documentation in recipes::case_weights and the examples on
`tidymodels.org`.

`step_discretize_xgb()`, `recipes::recipe()`, `recipes::prep()`, `recipes::bake()`

library(recipes)
library(embed)
library(modeldata)
data(ad_data)
library(rsample)

# Split the data, stratifying on the outcome
split <- initial_split(ad_data, strata = "Class")

ad_data_tr <- training(split)
ad_data_te <- testing(split)

cart_rec <-
  recipe(Class ~ ., data = ad_data_tr) %>%
  step_discretize_cart(
    tau, age, p_tau, Ab_42,
    outcome = "Class", id = "cart splits"
  )

cart_rec <- prep(cart_rec, training = ad_data_tr)

# The splits:
tidy(cart_rec, id = "cart splits")

bake(cart_rec, ad_data_te, tau)
