supervised_data: Data Splitting for Supervised Machine Learning


View source: R/supervised_data.R

Description

A function that uses the initial_split function from tidymodels to perform data splitting, while providing convenient access to the X and y portions of both the train split and the test split.
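The behaviour described above can be sketched roughly as follows. This is a hypothetical re-implementation for illustration only, not the package source: the name supervised_data_sketch is made up, the rsample package (which provides initial_split, training and testing within tidymodels) is assumed to be installed, and returning the y portion as a data frame rather than a vector is an assumption.

```r
# Hypothetical sketch for illustration; not the package's actual source.
library(rsample)

supervised_data_sketch <- function(data, xcols, ycols, ...) {
  # Split rows into training and test sets; extra arguments (e.g. prop)
  # are forwarded to rsample::initial_split().
  split <- initial_split(data, ...)
  train <- training(split)
  test <- testing(split)

  # drop = FALSE keeps single-column selections as data frames
  # (an assumption; the real function may return vectors instead).
  list(
    train  = train,
    test   = test,
    xtrain = train[, xcols, drop = FALSE],
    ytrain = train[, ycols, drop = FALSE],
    xtest  = test[, xcols, drop = FALSE],
    ytest  = test[, ycols, drop = FALSE]
  )
}

set.seed(1353)
cars_sketch <- supervised_data_sketch(
  mtcars,
  xcols = c("mpg", "cyl", "disp"),
  ycols = "hp"
)
```

Bundling the full splits with their X and y subsets in one list means the column selection only has to be written once per call.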

Usage

supervised_data(data, xcols, ycols, ...)

Arguments

data

the original dataset to be used for splitting

xcols

a vector containing feature names (X) to be used as independent variables

ycols

a vector containing target names (y) to be used as dependent variables or labels

...

additional parameters to pass to the initial_split function in tidymodels. See the tidymodels documentation for more details.
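As a rough illustration of what forwarding extra arguments can do, the sketch below calls initial_split from the rsample package directly and varies its prop argument, which sets the proportion of rows placed in the training split. It assumes rsample is installed. Whether supervised_data forwards prop in exactly this way is an assumption based on the description of the ... argument.

```r
# Illustrative only: calls rsample::initial_split() directly rather than
# supervised_data(), to show the effect of the prop argument.
library(rsample)

set.seed(1353)
split_default <- initial_split(mtcars)            # default proportion
split_half <- initial_split(mtcars, prop = 0.5)   # half of the rows for training

# Compare the number of training rows under each setting.
n_train_default <- nrow(training(split_default))
n_train_half <- nrow(training(split_half))
```

Under that assumption, the equivalent call through this function would be supervised_data(data, xcols, ycols, prop = 0.5).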

Value

A list with the following components (names as used in the example below): train, test, xtrain, ytrain, xtest and ytest.

Examples

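# Assumes the package is installed under its repository name, Rmleda.
library(Rmleda)

# Fix the random seed so the split is reproducible.
set.seed(1353)
cars <- supervised_data(mtcars, xcols = c('mpg', 'cyl', 'disp'), ycols = c('hp'))

# Full train and test splits.
train_data <- cars$train
test_data <- cars$test

# Feature (X) and target (y) portions of each split.
x_train <- cars$xtrain
y_train <- cars$ytrain
x_test <- cars$xtest
y_test <- cars$ytest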

UBC-MDS/Rmleda documentation built on March 29, 2021, 7:04 a.m.