pre.process: Pre-processing of data

View source: R/pre_process.R

Description

Pre-processes data according to a defined pipeline. If a mapping list is supplied in addition to the pipeline, the function runs in production mode; otherwise it runs in development mode and creates the mapping list.

Usage

pre.process(data, x = NULL, y = NULL, id.feats = NULL, pipeline,
  mapping.list = NULL, verbose = TRUE)

Arguments

data

[required | data.frame] Data frame to pre-process

x

[optional | character | default=NULL] A vector of feature names present in the dataset to be used as input features. If NULL, all features in the dataset are used except the ID and y features, if those are provided.

y

[optional | character | default=NULL] The name of the target feature contained in the dataset. If NULL, the data is processed without relying on the target feature.

id.feats

[optional | character | default=NULL] Names of ID features. If provided, the dataset is de-duplicated according to these ID features.

pipeline

[required | list] Pipeline list object. When preparing data for the first time, pass the list returned by the function design.pipeline. When processing data again, e.g. in a production environment, pass the pipeline list returned by a previous call to pre.process.

mapping.list

[optional | list | default=NULL] List of mapping tables. If NULL, the function runs in development mode and produces a list of mapping tables, which is included in the output.

verbose

[optional | logical | default=TRUE] Whether the function output is chatty (TRUE) or silent (FALSE).
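
As a rough illustration of how x, y and id.feats fit together, the sketch below adds a hypothetical row.id column to iris, duplicates a few rows, and passes that column through id.feats. The row.id column, the chosen input features and the pipeline name "iris.ids" are assumptions made for this example only. The pre.process call runs only if the lazy package is installed.

```r
# Build a small data frame with an ID column and a few duplicated rows.
# row.id is a hypothetical column added only for this illustration.
df <- cbind(row.id = seq_len(nrow(iris)), iris)
df <- rbind(df, df[1:5, ])  # rows 1 to 5 now appear twice

# Input features to use; the ID and target columns are left out.
input.feats <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

if (requireNamespace("lazy", quietly = TRUE)) {
  library(lazy)
  res <- pre.process(data = df,
                     x = input.feats,
                     y = "Species",
                     id.feats = "row.id",
                     pipeline = design.pipeline(pipeline.name = "iris.ids"),
                     verbose = FALSE)
  str(res, max.level = 1)  # inspect the top-level elements of the result
}
```

Calling str on the result is a simple way to see what the returned list contains without relying on any particular element names.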

Value

A list containing the pre-processed dataset, the mapping list and the designed pipeline.

Author(s)

Xander Horn

Examples

res <- pre.process(data = iris, y = "Species",
  pipeline = design.pipeline(pipeline.name = "iris.quick", kmeans.features = TRUE))
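
The development and production modes described above could be chained roughly as follows. This is a sketch under assumptions, not the author's documented workflow: the split of iris into dev.data and new.data, the pipeline name "iris.dev", and in particular the element names pipeline and mapping.list on the returned list are hypothetical, since the documentation does not state how the returned list is named. The code checks the names before using them and only calls pre.process if the lazy package is installed.

```r
# Split iris into a development set and unseen "new" data (base R only).
set.seed(42)
dev.idx  <- sample(nrow(iris), size = 100)
dev.data <- iris[dev.idx, ]
new.data <- iris[-dev.idx, ]

if (requireNamespace("lazy", quietly = TRUE)) {
  library(lazy)

  # Development mode: no mapping.list is passed, so mapping tables are created.
  dev.res <- pre.process(data = dev.data, y = "Species",
                         pipeline = design.pipeline(pipeline.name = "iris.dev"),
                         verbose = FALSE)
  str(dev.res, max.level = 1)  # check the actual element names first

  # ASSUMPTION: the returned list has elements named `pipeline` and
  # `mapping.list`. Adjust these names to whatever str() reports.
  if (all(c("pipeline", "mapping.list") %in% names(dev.res))) {
    # Production mode: reuse the pipeline and mapping tables on new data.
    prod.res <- pre.process(data = new.data, y = "Species",
                            pipeline = dev.res$pipeline,
                            mapping.list = dev.res$mapping.list,
                            verbose = FALSE)
  } else {
    message("Element names differ from the assumed ones; see str() output.")
  }
}
```

Reusing the mapping tables from the development run is what lets new data be transformed the same way as the data the pipeline was designed on.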

XanderHorn/lazy documentation built on Jan. 16, 2021, 6:15 p.m.