check_class: Check Variable Class


View source: R/class.R

Description

check_class creates a specification of a recipe check that tests whether a variable is of a designated class.

Usage

check_class(
  recipe,
  ...,
  role = NA,
  trained = FALSE,
  class_nm = NULL,
  allow_additional = FALSE,
  skip = FALSE,
  class_list = NULL,
  id = rand_id("class")
)

## S3 method for class 'check_class'
tidy(x, ...)

Arguments

recipe

A recipe object. The check will be added to the sequence of operations for this recipe.

...

One or more selector functions to choose which variables are affected by the check. See selections() for more details. For the tidy method, these are not currently used.

role

Not used by this check since no new variables are created.

trained

A logical to indicate if the quantities for preprocessing have been estimated.

class_nm

A character vector that will be used in inherits to check the class. If NULL, the classes will be learned in prep. Can contain more than one class.

allow_additional

If TRUE, a variable is allowed to have classes in addition to the one(s) that are checked.

skip

A logical. Should the check be skipped when the recipe is baked by bake.recipe()? While all operations are baked when prep.recipe() is run, some operations may not be able to be conducted on new data (e.g. processing the outcome variable(s)). Care should be taken when using skip = TRUE as it may affect the computations for subsequent operations.

class_list

A named list of column classes. This is NULL until computed by prep.recipe().

id

A character string that is unique to this check to identify it.

x

A check_class object.

Details

This function can check the classes of the variables in two ways. When the class_nm argument is provided, it checks whether all of the specified variables are of the given class. If class_nm is NULL, the check learns the classes of each of the specified variables in prep. Either way, bake will fail if the variables are not of the requested class. If a variable has multiple classes in prep, all of the classes are checked. Please note that in prep the argument strings_as_factors defaults to TRUE. If the training set contains character variables, the check will break bake when strings_as_factors is TRUE.
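The comparison described above can be illustrated with a small base R sketch. The helper name check_class_sketch is hypothetical and is not part of recipes; it only mimics the documented behaviour (every requested class must be present, and extra classes are rejected unless allow_additional is TRUE), and the package's actual implementation may differ.

```r
# Hypothetical helper (not a recipes function): mimics the documented check.
check_class_sketch <- function(x, class_nm, allow_additional = FALSE) {
  actual <- class(x)

  # Every requested class must be among the variable's classes.
  missing_cls <- setdiff(class_nm, actual)
  if (length(missing_cls) > 0) {
    stop("missing required class(es): ", paste(missing_cls, collapse = ", "))
  }

  # Classes beyond the requested ones are only tolerated on request.
  extra_cls <- setdiff(actual, class_nm)
  if (length(extra_cls) > 0 && !allow_additional) {
    stop("unexpected additional class(es): ", paste(extra_cls, collapse = ", "))
  }

  invisible(TRUE)
}

now <- Sys.time()  # a date-time carries two classes: "POSIXct" and "POSIXt"

check_class_sketch(1L, "integer")                           # passes
check_class_sketch(now, "POSIXt", allow_additional = TRUE)  # passes

# Without allow_additional, the extra "POSIXct" class triggers an error.
res <- tryCatch(
  check_class_sketch(now, "POSIXt"),
  error = function(e) conditionMessage(e)
)
print(res)
```

The date-time case shows why allow_additional exists: requesting only "POSIXt" fails under the strict default because the variable also carries "POSIXct".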

Value

An updated version of recipe with the new check added to the sequence of existing steps (if any). For the tidy method, a tibble with columns terms (the selectors or variables selected) and value (the class).

See Also

recipe() prep.recipe() bake.recipe()

Examples

library(recipes)
library(dplyr)
library(modeldata)
data(okc)

# Learn the classes on the train set
train <- okc[1:1000, ]
test  <- okc[1001:2000, ]
recipe(train, age ~ .) %>%
  check_class(everything()) %>%
  prep(train, strings_as_factors = FALSE) %>%
  bake(test)

# Manual specification
recipe(train, age ~ .) %>%
  check_class(age, class_nm = "integer") %>%
  check_class(diet, location, class_nm = "character") %>%
  check_class(date, class_nm = "Date") %>%
  prep(train, strings_as_factors = FALSE) %>%
  bake(test)

# By default only the classes that are specified
#   are allowed.
x_df <- tibble(time = c(Sys.time() - 60, Sys.time()))
x_df$time %>% class()
## Not run: 
recipe(x_df) %>%
  check_class(time, class_nm = "POSIXt") %>%
  prep(x_df) %>%
  bake(x_df)

## End(Not run)

# Use allow_additional = TRUE if additional classes are acceptable
recipe(x_df) %>%
  check_class(time, class_nm = "POSIXt", allow_additional = TRUE) %>%
  prep(x_df) %>%
  bake(x_df)
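The tidy method listed under Usage is not exercised in the examples above. A minimal sketch of how it might be called, assuming the check is the first operation in the recipe (hence number = 1) and that the tibble package is installed; the exact columns returned may vary between recipes versions, so no output is shown.

```r
library(recipes)

# Same toy data as above: one date-time column.
x_df <- tibble::tibble(time = c(Sys.time() - 60, Sys.time()))

rec <- recipe(x_df) %>%
  check_class(time, class_nm = "POSIXt", allow_additional = TRUE)

# Before prep(): the check is untrained, so only the selector is known.
tidy(rec, number = 1)

# After prep(): the check reports the resolved column and its class.
rec_trained <- prep(rec, training = x_df)
tidy(rec_trained, number = 1)
```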

recipes documentation built on July 2, 2020, 4:02 a.m.