split_test_train: Split into test and train data sets
In wactor: Word Factor Vectors


View source: R/split_test_train.R

Randomly partition input into a list of train and test data sets.

split_test_train(.data, .p = 0.8, ...)

`.data`	Input data. If atomic (numeric, integer, character, etc.), the input is first converted to a data frame with a single column named "x".
`.p`	Proportion of the input to use for the `train` data set. The default is 0.80, meaning `train` will include roughly 80 percent of the input cases and `test` roughly 20 percent.
`...`	Optional. The response (outcome) variable. Uses tidy evaluation (quotes are not necessary). This is only relevant if the identified variable is categorical (i.e., character, factor, or logical), in which case it is used to ensure that each response level is equally represented in the `train` output data set. If a value is supplied, uniformity in response level observations is prioritized over the `.p` (train proportion) value.
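
To make the interaction between `.p` and the response variable concrete, here is a minimal base R sketch of the behavior described above. It is not the wactor implementation: `sketch_split()` is a hypothetical helper, it takes the label as a quoted column name rather than through tidy evaluation, and the rule of sizing each level's sample from the smallest level is an assumption about how "uniformity is prioritized over `.p`" could work.

```r
## Hypothetical helper (not part of wactor) illustrating the documented behavior
sketch_split <- function(data, p = 0.8, label = NULL) {
  n <- nrow(data)
  if (is.null(label)) {
    ## plain random split: roughly p * n rows go to train
    train_rows <- sample.int(n, size = round(n * p))
  } else {
    ## balanced split: draw the same number of rows from every level;
    ## sizing from the smallest level is an assumption, not wactor's rule
    groups <- split(seq_len(n), data[[label]])
    per_level <- round(min(lengths(groups)) * p)
    train_rows <- unlist(
      lapply(groups, function(i) i[sample.int(length(i), per_level)]),
      use.names = FALSE
    )
  }
  test_rows <- setdiff(seq_len(n), train_rows)
  list(
    train = data[train_rows, , drop = FALSE],
    test = data[test_rows, , drop = FALSE]
  )
}

set.seed(1)
d <- data.frame(x = rnorm(100), z = c(rep("a", 80), rep("b", 20)))

plain <- sketch_split(d, 0.8)
balanced <- sketch_split(d, 0.8, label = "z")
table(balanced$train$z)
```

Under these assumptions the plain split puts 80 of the 100 rows in `train`, while the balanced split puts 16 rows of each level in `train` (32 rows in total), so the realized train proportion is well below `.p`.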

A list with train and test tibbles (data.frames).
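
A typical way to consume such a list is to fit on one element and evaluate on the other. The sketch below builds a stand-in list by hand in base R so that it runs without wactor installed. The element names `train` and `test` are assumed from the argument descriptions, and the linear model is only an illustration.

```r
## Stand-in for the return value of split_test_train(); built by hand here
set.seed(42)
d <- data.frame(x = rnorm(100))
d$y <- 2 * d$x + rnorm(100, sd = 0.1)
idx <- sample.int(nrow(d), 80)
s <- list(train = d[idx, ], test = d[-idx, ])

## fit on the train element, then evaluate on the held-out test element
fit <- lm(y ~ x, data = s$train)
pred <- predict(fit, newdata = s$test)
rmse <- sqrt(mean((s$test$y - pred)^2))
rmse
```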

## example data frame
d <- data.frame(
  x = rnorm(100),
  y = rnorm(100),
  z = c(rep("a", 80), rep("b", 20))
)

## split using defaults
split_test_train(d)

## split 0.60/0.40
split_test_train(d, 0.60)

## split with equal response level obs
split_test_train(d, 0.80, label = z)

## apply to atomic data
split_test_train(letters)