split_test_train: Split into test and train data sets

View source: R/split_test_train.R

Description

Randomly partition input into a list of train and test data sets

Usage

split_test_train(.data, .p = 0.8, ...)

Arguments

.data

Input data. If atomic (numeric, integer, character, etc.), the input is first converted to a data frame with a single column named "x".
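The conversion of atomic input can be pictured with a small base R sketch. This is an illustration of the described behavior, not the package's internal code.

```r
# Illustration only: wrap an atomic vector in a one-column data frame
x <- letters
if (is.atomic(x)) {
  # The single column is named "x", as described for .data
  x <- data.frame(x = x, stringsAsFactors = FALSE)
}
names(x)  # "x"
nrow(x)   # 26
```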

.p

Proportion of data that should be used for the train data set output. The default value is 0.80, meaning the train output will include roughly 80 pct. of the input cases while the test output will include roughly 20 pct.
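A random split by proportion can be sketched in base R as follows. The helper name sketch_split is hypothetical and is not part of wactor; the rounding rule is an assumption, and the package may handle it differently.

```r
# Hypothetical helper (not part of wactor): random train/test partition
sketch_split <- function(data, p = 0.8) {
  n <- nrow(data)
  # Assumption: the train size is n * p rounded to the nearest whole row
  n_train <- round(n * p)
  # Draw train row indices without replacement
  train_idx <- sample.int(n, n_train)
  # Every row not drawn for train goes to test
  test_idx <- setdiff(seq_len(n), train_idx)
  list(
    train = data[train_idx, , drop = FALSE],
    test = data[test_idx, , drop = FALSE]
  )
}

set.seed(1)
d <- data.frame(x = rnorm(100), y = rnorm(100))
s <- sketch_split(d, 0.8)
nrow(s$train)  # 80
nrow(s$test)   # 20
```

Using setdiff() rather than negative indexing keeps the test set correct even when no rows are drawn for train.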

...

Optional. The response (outcome) variable, supplied using tidy evaluation (quotes are not necessary). This is only relevant if the identified variable is categorical (i.e., character, factor, or logical), in which case it is used to give the train output data set an equal number of observations for each response level. If a value is supplied, equal representation of the response levels is prioritized over the .p (train proportion) value.
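One way to balance the train set across response levels is sketched below in base R. The helper sketch_balanced_split is hypothetical, it takes the column name as a string rather than using tidy evaluation, and the per-level sample size (the train proportion times the size of the rarest level) is an assumption rather than a confirmed description of the package's rule.

```r
# Hypothetical helper (not part of wactor): balanced train sample per level
sketch_balanced_split <- function(data, label, p = 0.8) {
  # Group row indices by response level
  idx_by_level <- split(seq_len(nrow(data)), data[[label]])
  # Assumption: each level contributes p times the size of the rarest level
  k <- round(min(lengths(idx_by_level)) * p)
  # Sample k rows from every level so the train set is balanced
  train_idx <- unlist(
    lapply(idx_by_level, function(i) i[sample.int(length(i), k)]),
    use.names = FALSE
  )
  test_idx <- setdiff(seq_len(nrow(data)), train_idx)
  list(
    train = data[train_idx, , drop = FALSE],
    test = data[test_idx, , drop = FALSE]
  )
}

set.seed(1)
d <- data.frame(
  x = rnorm(100),
  z = c(rep("a", 80), rep("b", 20))
)
s <- sketch_balanced_split(d, "z", 0.8)
table(s$train$z)  # 16 rows for "a", 16 rows for "b"
nrow(s$test)      # 68
```

Under this assumption the train set holds 32 of the 100 rows rather than 80, which shows how balancing the levels can override the requested proportion.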

Value

A list with two elements, train and test, each a tibble (data frame).

Examples

## example data frame
d <- data.frame(
  x = rnorm(100),
  y = rnorm(100),
  z = c(rep("a", 80), rep("b", 20))
)

## split using defaults
split_test_train(d)

## split 0.60/0.40
split_test_train(d, 0.60)

## split with equal response level obs
split_test_train(d, 0.80, label = z)

## apply to atomic data
split_test_train(letters)

wactor documentation built on Dec. 18, 2019, 5:07 p.m.