process_transform_throw_error: Validate and clean transform function output


View source: R/utils.R

Description

Helper function that ensures the output of a transform function is a data.frame and that this data.frame does not duplicate variables from the original (input data) data.frame. If duplicates are found, they are automatically dropped from the data.frame that this function returns.

Usage

process_transform_throw_error(input_df, output_df, func_name)

Arguments

input_df

The original (input data) data.frame - the transform function's argument.

output_df

The transform function's output.

func_name

The name of the ml_pipeline_builder transform method.

Value

If the transform function is not NULL, a copy of the transform function's output data.frame, with any duplicated input variables removed.

Examples

## Not run: 
transform_method <- function(df) cbind_fast(df, q = df$y * df$y)
data <- data.frame(y = c(1, 2), x = c(0.1, 0.2))
data_transformed <- transform_method(data)
process_transform_throw_error(data, data_transformed, "transform_method")
# transform_method yields data.frame that duplicates input vars - dropping the following
# columns: 'y', 'x'
#   q
# 1 1
# 2 4
# 2 4

## End(Not run)
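
The behaviour described above can be approximated in base R. The following is a minimal sketch, not the package's implementation. The helper name `drop_duplicated_inputs` is hypothetical. It assumes that "duplicated" means output columns whose names also appear in the input, and that output which is not a data.frame raises an error. Base `cbind` stands in for `cbind_fast` so that the snippet runs without the package.

```r
# Hypothetical stand-in for process_transform_throw_error (not the pipeliner source).
drop_duplicated_inputs <- function(input_df, output_df, func_name) {
  # Assumption: output that is not a data.frame is treated as an error.
  if (!is.data.frame(output_df)) {
    stop(func_name, " does not produce a data.frame.", call. = FALSE)
  }

  # Output columns whose names also appear in the input.
  dup_cols <- intersect(names(output_df), names(input_df))
  if (length(dup_cols) > 0) {
    message(func_name, " duplicates input vars - dropping: ",
            paste0("'", dup_cols, "'", collapse = ", "))
  }

  # drop = FALSE keeps the result a data.frame even when one column remains.
  output_df[, setdiff(names(output_df), dup_cols), drop = FALSE]
}

data <- data.frame(y = c(1, 2), x = c(0.1, 0.2))
data_transformed <- cbind(data, q = data$y * data$y)
result <- drop_duplicated_inputs(data, data_transformed, "transform_method")
```

Matching on column names alone is the simplest choice here. A stricter check could also compare column contents, at the cost of extra work on large data frames.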

AlexIoannides/pipeliner documentation built on May 5, 2019, 4:52 a.m.