split_df: Split a data frame into two parts

split_df R Documentation

Split a data frame into two parts

Description

Applies a stratified split to a data frame and returns each part. For more information about splitting options and an extensive list of examples, see get_split_indexes_from_stratum.

Usage

split_df(data, stratification = NULL, ...)

Arguments

data

(data frame) Data to split, in long format, with one row per observation.

stratification

(vector) Vector that identifies which subsets of data should be split separately (denoted as strata in splitting functions) to ensure they are evenly distributed between parts. If NULL, all data are treated as a single stratum.

...

Arguments passed on to get_split_indexes_from_stratum

method

(character) Splitting method. Note that the first_second and odd_even splitting methods will only deliver a valid split with default settings for the other arguments (subsample_p = 1, split_p = 1, replace = TRUE).

replace

(logical) If FALSE, splits are constructed by sampling from stratum without replacement. If TRUE, stratum is sampled with replacement.

split_p

(numeric) Desired joint size of both parts, expressed as a proportion of the size of the subsampled stratum. If split_p is larger than 1 and careful is FALSE, then parts are automatically sampled with replacement.

subsample_p

(numeric) Proportion of the stratum to subsample for use in the split.

careful

(logical) If TRUE, stop with an error when called with arguments that may yield unexpected splits.

Value

(list) List with two elements, each containing one of the two parts.

See Also

Other splitting functions: apply_split_indexes_to_strata(), apply_split_indexes_to_stratum(), check_strata(), get_split_indexes_from_strata(), get_split_indexes_from_stratum(), split_strata(), split_stratum(), stratify()

Examples

library(splithalfr)

# Conditions in blocks: a a a a b b b b
ds <- data.frame(condition = rep(c("a", "b"), each = 4), score = 1:8)
split_df(ds, method = "random")
split_df(ds, method = "odd_even")
split_df(ds, method = "first_second")
split_df(ds, stratification = ds$condition, method = "random")
split_df(ds, stratification = ds$condition, method = "odd_even")
split_df(ds, stratification = ds$condition, method = "first_second")

# Conditions interleaved: a b a b a b a b
ds <- data.frame(condition = rep(c("a", "b"), 4), score = 1:8)
split_df(ds, method = "random")
split_df(ds, method = "odd_even")
split_df(ds, method = "first_second")
split_df(ds, stratification = ds$condition, method = "random")
split_df(ds, stratification = ds$condition, method = "odd_even")
split_df(ds, stratification = ds$condition, method = "first_second")
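
To illustrate why the stratification argument matters, the following base-R sketch mimics an odd/even split with and without strata on a data frame whose conditions are interleaved. It is only an approximation of the idea described under Arguments, not the splithalfr implementation; odd_even_by_stratum() is a hypothetical helper written for this example, and the order of rows it returns may differ from what split_df produces.

```r
# Illustrative base-R sketch only; this is NOT the splithalfr implementation.
# odd_even_by_stratum() is a hypothetical helper introduced for this example.
odd_even_by_stratum <- function(data, stratification = NULL) {
  # With no stratification, treat all rows as a single stratum
  if (is.null(stratification)) {
    stratification <- rep(1, nrow(data))
  }
  # Row indexes of each stratum, in original row order
  strata <- split(seq_len(nrow(data)), stratification)
  # Within each stratum, odd positions go to part 1 and even positions to part 2
  part_1 <- unlist(lapply(strata, function(i) i[seq_along(i) %% 2 == 1]),
                   use.names = FALSE)
  part_2 <- unlist(lapply(strata, function(i) i[seq_along(i) %% 2 == 0]),
                   use.names = FALSE)
  list(data[part_1, , drop = FALSE], data[part_2, , drop = FALSE])
}

# Conditions interleaved: a b a b a b a b
ds <- data.frame(condition = rep(c("a", "b"), 4), score = 1:8)
unstratified <- odd_even_by_stratum(ds)
stratified <- odd_even_by_stratum(ds, ds$condition)

# Without strata, part 1 (odd rows) contains only condition "a"
table(unstratified[[1]]$condition)
# With strata, part 1 contains two rows of each condition
table(stratified[[1]]$condition)
```

In this sketch the unstratified split confounds parts with conditions, whereas splitting each stratum separately keeps both conditions equally represented in both parts.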

splithalfr documentation built on Sept. 15, 2023, 1:08 a.m.