split_data: AutoScore Function: Automatically splitting dataset to train,...

View source: R/common.R

split_data    R Documentation

AutoScore Function: Automatically splitting dataset to train, validation and test set, possibly stratified by label

Description

Splits a dataset into training, validation, and test sets according to the given ratio. The split can optionally be stratified by the outcome label.

Usage

split_data(data, ratio, cross_validation = FALSE, strat_by_label = FALSE)

Arguments

data

The dataset to be split

ratio

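A numeric vector of three proportions for dividing the dataset into training, validation, and test sets, in that order. The documented default is c(0.7, 0.1, 0.2), but the usage above declares no default value for ratio, so pass it explicitly.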

cross_validation

If set to TRUE, cross-validation is used to generate the parsimony plot, which is suitable for small datasets. Defaults to FALSE.

strat_by_label

If set to TRUE, the split is stratified on the outcome variable. Defaults to FALSE.
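To make the ratio and strat_by_label arguments concrete, here is a minimal base-R sketch of a stratified three-way split. It is an illustration of the general technique only and is not AutoScore's implementation. The helper name `stratified_split_sketch`, its `label_col` argument, the element names of the returned list, and the toy data are all assumptions made for this example.

```r
# Illustrative sketch only: NOT the code used by AutoScore::split_data().
# `stratified_split_sketch` and `label_col` are hypothetical names.
stratified_split_sketch <- function(data, ratio, label_col = "label") {
  stopifnot(length(ratio) == 3, all(ratio >= 0), sum(ratio) > 0)
  ratio <- ratio / sum(ratio)  # normalise so the proportions sum to 1

  # Start with every row assigned to the training set.
  assignment <- rep("train", nrow(data))

  # Within each label class, shuffle the rows and carve off the
  # validation and test shares; the remainder stays in training.
  for (rows in split(seq_len(nrow(data)), data[[label_col]])) {
    rows <- rows[sample.int(length(rows))]
    n_val <- round(ratio[2] * length(rows))
    n_test <- min(round(ratio[3] * length(rows)), length(rows) - n_val)
    assignment[rows[seq_len(n_val)]] <- "validation"
    assignment[rows[n_val + seq_len(n_test)]] <- "test"
  }

  list(
    train = data[assignment == "train", , drop = FALSE],
    validation = data[assignment == "validation", , drop = FALSE],
    test = data[assignment == "test", , drop = FALSE]
  )
}

# Toy data: 200 rows, 20% positive labels.
set.seed(4)
toy <- data.frame(x = rnorm(200), label = rep(c(0, 1), times = c(160, 40)))
parts <- stratified_split_sketch(toy, ratio = c(0.7, 0.1, 0.2))

sapply(parts, nrow)                        # rows per subset
sapply(parts, function(d) mean(d$label))   # label prevalence per subset
```

Because the sampling is done separately within each label class, every subset keeps roughly the same label prevalence as the full data. An unstratified split only achieves this on average, which matters most when the outcome is rare.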

Value

A list containing the training, validation, and test sets.

Examples

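# Load the bundled example data and rename the outcome column to "label"
data("sample_data")
names(sample_data)[names(sample_data) == "Mortality_inpatient"] <- "label"
set.seed(4)  # make the random split reproducible
# Large sample size: separate training, validation and test sets
out_split <- split_data(data = sample_data, ratio = c(0.7, 0.1, 0.2))
# Small sample size: no validation share, use cross-validation instead
out_split <- split_data(data = sample_data, ratio = c(0.7, 0, 0.3),
                        cross_validation = TRUE)
# Large sample size, stratified by the outcome label
out_split <- split_data(data = sample_data, ratio = c(0.7, 0.1, 0.2),
                        strat_by_label = TRUE)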
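After splitting, it can be useful to check that the subset sizes and label balance look as intended. The sketch below is one hedged way to do that in base R. `summarise_split` is a hypothetical helper, not part of AutoScore, and the element names in the toy list (`train_set`, `validation_set`, `test_set`) are assumptions used as a stand-in for a real split result.

```r
# `summarise_split` is a hypothetical helper, not an AutoScore function.
# It works on any named list of data frames that share a label column.
summarise_split <- function(split_list, positive, label_col = "label") {
  n <- unname(vapply(split_list, nrow, integer(1)))
  # Share of rows whose label equals `positive`; NA for an empty subset.
  pos_share <- unname(vapply(
    split_list,
    function(d) {
      if (nrow(d) == 0) return(NA_real_)
      mean(as.character(d[[label_col]]) == as.character(positive))
    },
    numeric(1)
  ))
  data.frame(
    subset = names(split_list),
    n_rows = n,
    share = n / sum(n),
    positive_share = pos_share
  )
}

# Toy stand-in for a split result; the element names are assumptions.
toy_split <- list(
  train_set = data.frame(label = c(0, 0, 0, 1, 0, 1, 0)),
  validation_set = data.frame(label = c(0)),
  test_set = data.frame(label = c(1, 0))
)
summarise_split(toy_split, positive = 1)
```

With AutoScore loaded, the same helper could be applied to `out_split`, passing whichever value marks the positive class in the label column as `positive`. Comparing `positive_share` across subsets with and without `strat_by_label = TRUE` shows the effect of stratification.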

AutoScore documentation built on Oct. 16, 2022, 1:06 a.m.