trainTestSplit: Split your data into a training and testing set

View source: R/trainTestSplit.R

Description

Creates training and testing data sets from a single data set. Also returns bootstrap resamples of the training data set.

Usage

trainTestSplit(
  data = df,
  splitAmt = 0.8,
  timeDependent = FALSE,
  responseVar = "nameOfResponseVar",
  stratifyOnResponse = FALSE,
  numberOfBootstrapSamples = 25
)

Arguments

data

The data set of interest.

splitAmt

Numeric. The proportion of the data to place in the training set. Default is 0.8.

timeDependent

Logical. Is your data time-dependent? If so, set TRUE.

responseVar

Character. Name of the response variable in the analysis.

stratifyOnResponse

Logical. Should the training and testing splits be stratified based on the response? If so, set TRUE.

numberOfBootstrapSamples

Numeric. How many bootstrap samples do you want? Default is 25.
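The package source is not reproduced on this page, so the following is only a sketch of how the timeDependent and stratifyOnResponse options could correspond to rsample calls. The mapping to `initial_time_split()` and to the `strata` argument of `initial_split()` is an assumption, and the data frame `df` with columns `t` and `y` is made up for illustration.

```r
library(rsample)

set.seed(42)  # make the random split reproducible

# Illustrative data: 100 rows in time order with a two-level response
df <- data.frame(
  t = 1:100,
  y = factor(rep(c("a", "b"), times = c(30, 70)))
)

# Time-ordered split (assumed analogue of timeDependent = TRUE):
# the earliest rows go to training, the latest rows to testing
time_split <- initial_time_split(df, prop = 0.8)
max(training(time_split)$t) < min(testing(time_split)$t)

# Stratified random split (assumed analogue of stratifyOnResponse = TRUE):
# the class proportions of y are roughly preserved in both sets
strat_split <- initial_split(df, prop = 0.8, strata = y)
prop.table(table(training(strat_split)$y))
```

A time-ordered split avoids training on observations that occur after the test period, while stratification is mainly useful when the response classes are imbalanced.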

Value

A list with four components: train is the training set, test is the testing set, boot holds the bootstrap resamples of the training set, and split is the rsample split object that was used to divide the original data set.
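As a rough illustration of the shape of this return value, a comparable list can be assembled with rsample directly. This is a sketch under assumptions, not the package's implementation: the use of `initial_split()`, `training()`, `testing()`, and `bootstraps()` is inferred from the description above, and `mtcars` and the object names are illustrative.

```r
library(rsample)

set.seed(123)  # make the random split reproducible

# Illustrative data; any data frame would do
df <- mtcars

# 80/20 split, analogous to splitAmt = 0.8 (assumed mapping)
split_obj <- initial_split(df, prop = 0.8)
train <- training(split_obj)
test  <- testing(split_obj)

# 25 bootstrap resamples of the training set,
# analogous to numberOfBootstrapSamples = 25 (assumed mapping)
boot <- bootstraps(train, times = 25)

# Same four component names as the documented return value
out <- list(train = train, test = test, boot = boot, split = split_obj)
names(out)
```

Keeping the split object alongside the extracted data frames is convenient because functions such as `tune::last_fit()` expect the split itself rather than the separate training and testing sets.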

Examples

library(easytidymodels)
library(dplyr)
utils::data(penguins, package = "modeldata")
resp <- "sex"
split <- trainTestSplit(penguins, stratifyOnResponse = TRUE, responseVar = resp)

# Training data
split$train

# Testing data
split$test

# Bootstrap resamples of the training data
split$boot

# Split object (helpful to call if you want to do model stacking)
split$split

amanda-park/easytidymodels documentation built on Dec. 13, 2021, 11:28 a.m.