split_random: Randomly split dataset in multiple parts
In RSSL: Implementations of Semi-Supervised Learning Approaches for Classification

split_random

R Documentation

Randomly split dataset in multiple parts

Description

Randomly splits a data.frame into multiple parts. The labels should either be in the first column of the data.frame, or be indicated through formula.

Usage

split_random(df, formula = NULL, splits = c(0.5, 0.5), min_class = 0)

Arguments

`df`	data.frame; Data frame of interest
`formula`	formula; Formula to indicate the outputs
`splits`	numeric; Probability of assigning to each part, automatically normalized, should be >1
`min_class`	integer; minimum number of objects per class in each part

Value

list of data.frames

Examples

library(RSSL)   # provides generate2ClassGaussian and split_random
library(dplyr)  # provides the %>% pipe

df <- generate2ClassGaussian(200,d=2)
dfs <- df %>% split_random(Class~.,splits=c(0.5,0.3,0.2),min_class=1)
names(dfs) <- c("Train","Validation","Test")
lapply(dfs,summary)
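
To illustrate what "automatically normalized" probabilities mean for splits, here is a minimal base R sketch. It is not the RSSL implementation: sketch_split is a hypothetical helper written for this illustration, it assumes each row is assigned to a part independently, and it ignores the min_class constraint.

```r
# Hypothetical helper, NOT part of RSSL: assigns each row to a part at random.
# Assumption: rows are assigned independently; min_class is not enforced here.
sketch_split <- function(df, splits = c(0.5, 0.5)) {
  stopifnot(is.data.frame(df), length(splits) > 1,
            all(splits >= 0), sum(splits) > 0)
  # Normalize the weights so they sum to 1
  probs <- splits / sum(splits)
  # Draw a part index for every row using the normalized probabilities
  part <- sample(seq_along(probs), size = nrow(df), replace = TRUE, prob = probs)
  # Explicit factor levels keep a part in the result even if it ends up empty
  split(df, factor(part, levels = seq_along(probs)))
}

set.seed(1)
toy <- data.frame(Class = factor(rep(c("a", "b"), each = 50)), x = rnorm(100))
parts <- sketch_split(toy, splits = c(5, 3, 2))  # normalized to c(0.5, 0.3, 0.2)
names(parts) <- c("Train", "Validation", "Test")
sapply(parts, nrow)
```

Because the weights are rescaled, c(5, 3, 2) and c(0.5, 0.3, 0.2) describe the same split in this sketch. The part sizes are random, so they only approximate the requested proportions.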

RSSL documentation built on May 29, 2024, 2:38 a.m.