# step_num2factor: Convert Numbers to Factors

In recipes: Preprocessing Tools to Create Design Matrices

## Description

`step_num2factor` will convert one or more numeric vectors to factors (ordered or unordered). This can be useful when categories are encoded as integers.

## Usage

```r
step_num2factor(
  recipe,
  ...,
  role = NA,
  transform = function(x) x,
  trained = FALSE,
  levels,
  ordered = FALSE,
  skip = FALSE,
  id = rand_id("num2factor")
)

## S3 method for class 'step_num2factor'
tidy(x, ...)
```

## Arguments

- `recipe`: A recipe object. The step will be added to the sequence of operations for this recipe.
- `...`: One or more selector functions to choose which variables will be converted to factors. See `selections()` for more details. For the `tidy` method, these are not currently used.
- `role`: Not used by this step since no new variables are created.
- `transform`: A function taking a single argument `x` that can be used to modify the numeric values prior to determining the levels (perhaps using `base::as.integer()`). The output of the function should be an integer that corresponds to the position in `levels` of the level that should be assigned. If it is not an integer, the value will be converted to an integer during `bake()`.
- `trained`: A logical to indicate if the quantities for preprocessing have been estimated.
- `levels`: A character vector of values that will be used as the levels. These are the numeric data converted to character and ordered. This is modified once `prep.recipe()` is executed.
- `ordered`: A single logical value; should the factor(s) be ordered?
- `skip`: A logical. Should the step be skipped when the recipe is baked by `bake.recipe()`? While all operations are baked when `prep.recipe()` is run, some operations may not be able to be conducted on new data (e.g. processing the outcome variable(s)). Care should be taken when using `skip = TRUE` as it may affect the computations for subsequent operations.
- `id`: A character string that is unique to this step to identify it.
- `x`: A `step_num2factor` object.
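To make the interplay between `transform` and `levels` concrete, the following is a minimal base R sketch of the mapping described above. It is an illustration only, not the package's actual source: `num_to_factor` is a hypothetical helper, and the input vector `codes` is made-up data.

```r
# Hypothetical helper (not part of recipes): approximates the mapping
# described for the `transform` and `levels` arguments.
num_to_factor <- function(x, levels, transform = function(x) x, ordered = FALSE) {
  idx <- transform(x)
  # Non-integer results are coerced to integer, as the argument text describes.
  if (!is.integer(idx)) idx <- as.integer(idx)
  # Each integer selects the level at that position; positions beyond
  # length(levels) select nothing and therefore become NA.
  factor(levels[idx], levels = levels, ordered = ordered)
}

amnt <- c("nothin", "meh", "some", "copious")

# Made-up codes on a 0-based scale; 7 is deliberately out of range.
codes <- c(0, 1, 3, 2, 7)

# Shift by one so that code 0 maps to the first level.
res <- num_to_factor(codes, levels = amnt, transform = function(x) x + 1)
print(res)
```

In this sketch the shift `x + 1` is needed because R indexes from 1, so a 0-based code would otherwise select no level at all. The out-of-range code ends up as `NA` rather than raising an error.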

## Value

An updated version of `recipe` with the new step added to the sequence of existing steps (if any). For the `tidy` method, a tibble with columns `terms` (the selectors or variables selected) and `ordered`.

## See Also

`step_factor2string()`, `step_string2factor()`, `step_dummy()`

## Examples

```r
library(recipes)
library(dplyr)
library(modeldata)
data(attrition)

attrition %>%
  group_by(StockOptionLevel) %>%
  count()

amnt <- c("nothin", "meh", "some", "copious")

rec <-
  recipe(Attrition ~ StockOptionLevel, data = attrition) %>%
  step_num2factor(
    StockOptionLevel,
    transform = function(x) x + 1,
    levels = amnt
  )

encoded <- rec %>% prep() %>% juice()

table(encoded$StockOptionLevel, attrition$StockOptionLevel)

# an example for binning

binner <- function(x) {
  x <- cut(x, breaks = 1000 * c(0, 5, 10, 20), include.lowest = TRUE)
  # now return the group number
  as.numeric(x)
}

inc <- c("low", "med", "high")

rec <-
  recipe(Attrition ~ MonthlyIncome, data = attrition) %>%
  step_num2factor(
    MonthlyIncome,
    transform = binner,
    levels = inc,
    ordered = TRUE
  ) %>%
  prep()

encoded <- juice(rec)

table(encoded$MonthlyIncome, binner(attrition$MonthlyIncome))

# What happens when a value is out of range?
ceo <- attrition %>% slice(1) %>% mutate(MonthlyIncome = 10^10)

bake(rec, ceo)
```