split_random: Randomly split dataset in multiple parts

View source: R/Evaluate.R

split_random    R Documentation

Randomly split dataset in multiple parts

Description

Randomly splits a data.frame into multiple parts. The first column of the data.frame should contain the labels; otherwise, a formula should be supplied to indicate the label column.

Usage

split_random(df, formula = NULL, splits = c(0.5, 0.5), min_class = 0)

Arguments

df

data.frame; Data frame of interest

formula

formula; Formula to indicate the outputs

splits

numeric; Probability of assigning to each part, automatically normalized, should be >1

min_class

integer; minimum number of objects per class in each part
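
To illustrate what "automatically normalized" could mean for splits, here is a minimal base R sketch. It is not the RSSL implementation: it assumes each row is assigned independently to a part with probability proportional to the corresponding entry of splits, and it ignores min_class. The toy data frame and the names probs, part and parts are made up for this illustration.

```r
# Illustrative sketch only; NOT how split_random() is implemented in RSSL.
set.seed(42)

splits <- c(5, 3, 2)             # unnormalized weights (assumed input)
probs  <- splits / sum(splits)   # normalized so the entries sum to 1

# Hypothetical toy data: labels in the first column, one feature
df <- data.frame(Class = factor(rep(c("a", "b"), each = 50)),
                 x     = rnorm(100))

# Assumption: every row is assigned to a part independently
part <- sample(seq_along(probs), size = nrow(df),
               replace = TRUE, prob = probs)

# One data.frame per part; fixed levels keep empty parts in the list
parts <- split(df, factor(part, levels = seq_along(probs)))
sapply(parts, nrow)
```

Under this assumption the part sizes only match the requested proportions on average, so a small class can end up missing from a part, which is the situation a constraint such as min_class is meant to rule out.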

Value

list of data.frames

See Also

Other RSSL utilities: LearningCurveSSL(), SSLDataFrameToMatrices(), add_missinglabels_mar(), df_to_matrices(), measure_accuracy(), missing_labels(), split_dataset_ssl(), true_labels()

Examples

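library(RSSL)   # provides generate2ClassGaussian() and split_random()
library(dplyr)  # provides the %>% pipe

# Generate 200 objects from a two-class Gaussian problem in 2 dimensions
df <- generate2ClassGaussian(200,d=2)
# Split into three parts, with at least one object per class in each part
dfs <- df %>% split_random(Class~.,splits=c(0.5,0.3,0.2),min_class=1)
names(dfs) <- c("Train","Validation","Test")
lapply(dfs,summary)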


jkrijthe/RSSL documentation built on Jan. 13, 2024, 1:56 a.m.