step_untokenize: Untokenization of list-column variables


Description

'step_untokenize' creates a *specification* of a recipe step that will convert a list of tokens into a character predictor.

Usage

step_untokenize(recipe, ..., role = NA, trained = FALSE,
  columns = NULL, sep = " ", skip = FALSE,
  id = rand_id("untokenize"))

## S3 method for class 'step_untokenize'
tidy(x, ...)

Arguments

recipe

A recipe object. The step will be added to the sequence of operations for this recipe.

...

One or more selector functions to choose variables. For 'step_untokenize', this indicates the tokenized list-column variables to be converted back into character variables. See [recipes::selections()] for more details. For the 'tidy' method, these are not currently used.

role

Not used by this step since no new variables are created.

trained

A logical to indicate if the step has been trained by [recipes::prep.recipe()].

columns

A list of tibble results that define the encoding. This is 'NULL' until the step is trained by [recipes::prep.recipe()].

sep

A character string that determines how the tokens are separated when pasted together. Defaults to '" "'.
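The effect of 'sep' can be illustrated in base R. This is only a sketch of the pasting behaviour described above, not the package's internal code; the helper 'untokenize_one' is a hypothetical name introduced for illustration.

```r
# Hypothetical helper (not part of textrecipes): collapse one vector of
# tokens into a single string, joining the tokens with `sep`.
untokenize_one <- function(tokens, sep = " ") {
  paste(tokens, collapse = sep)
}

tokens <- c("the", "quick", "brown", "fox")

untokenize_one(tokens)             # "the quick brown fox"
untokenize_one(tokens, sep = "_")  # "the_quick_brown_fox"
untokenize_one(tokens, sep = "")   # "thequickbrownfox"
```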

skip

A logical. Should the step be skipped when the recipe is baked by [recipes::bake.recipe()]? While all operations are baked when [recipes::prep.recipe()] is run, some operations may not be able to be conducted on new data (e.g. processing the outcome variable(s)). Care should be taken when using 'skip = TRUE' as it may affect the computations for subsequent operations.

id

A character string that is unique to this step to identify it.

x

A 'step_untokenize' object.

Details

This step will turn a tokenized list-column back into a character vector.
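Conceptually, the conversion resembles the following base R sketch. It assumes tokens are plain character vectors stored in a list and uses a naive split on single spaces in place of a real tokenizer; the data and variable names are made up for illustration, and the step itself may be implemented differently.

```r
# Made-up example texts (single spaces only, so the round trip is exact).
essays <- c("a short essay", "another one")

# Naive tokenization: each element becomes a character vector of tokens,
# giving a list with one entry per document.
tokenized <- strsplit(essays, " ", fixed = TRUE)

# Untokenization: paste each token vector back into one string,
# yielding an ordinary character vector again.
restored <- vapply(tokenized, paste, character(1), collapse = " ")

identical(restored, essays)
```

With a real tokenizer the round trip is usually not exact, since tokenization may lowercase the text or drop punctuation before the tokens are pasted back together.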

Value

An updated version of 'recipe' with the new step added to the sequence of existing steps (if any).

Examples

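library(recipes)
library(textrecipes)

data(okc_text)

# Tokenize essay0, then paste the tokens back into a single string
okc_rec <- recipe(~ ., data = okc_text) %>%
  step_tokenize(essay0) %>%
  step_untokenize(essay0)

# Train the recipe; retain = TRUE keeps the processed training set
okc_obj <- okc_rec %>%
  prep(training = okc_text, retain = TRUE)

# essay0 is a character column again
juice(okc_obj, essay0) %>%
  slice(1:2)

juice(okc_obj) %>%
  slice(2) %>%
  pull(essay0)

# Step details before and after training
tidy(okc_rec, number = 2)
tidy(okc_obj, number = 2)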

textrecipes documentation built on May 2, 2019, 1:27 p.m.