Splits a data set into n sets that follow the target column's class distribution, sampling with or without replacement. Each set should contain at least one observation per class so that the distribution is preserved.
split_data(df = df, target_column = "some_column")
df | data.frame
target_column | string; name of the column to treat as the target column.
ratio | numeric vector; the proportions in which to split the data.
replace | boolean; whether to sample with or without replacement.
A list of n data frames, each with the same target distribution.
df <- data.frame(column1 = rep(TRUE, 50),
                 column2 = c(LETTERS[1:25], LETTERS[26:2]),
                 column3 = seq(1, 50),
                 column4 = c(rep("a", 45), rep("b", 5)),
                 column5 = seq(1, 50, by = 1),
                 target_col = c(rep("A", 25), rep("B", 25)),
                 stringsAsFactors = FALSE)
split_data(df = df, target_column = "target_col", ratio = c(0.1, 0.4, 0.3, 0.2))
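To make the idea of a distribution-preserving split concrete, here is a minimal base-R sketch of a stratified split without replacement. It is not the package's implementation: `stratified_split_sketch` and the `toy` data frame are hypothetical names introduced only for illustration, and unlike the description above the sketch does not enforce at least one observation per class in every set.

```r
# Hypothetical helper, NOT the package's split_data(): a base-R sketch of a
# stratified split without replacement.
stratified_split_sketch <- function(df, target_column, ratio) {
  stopifnot(abs(sum(ratio) - 1) < 1e-8)
  n_sets <- length(ratio)
  # Row indices grouped by the class found in the target column
  idx_by_class <- split(seq_len(nrow(df)), df[[target_column]])
  assignment <- integer(nrow(df))
  for (idx in idx_by_class) {
    # Shuffle within the class; guard against sample()'s length-1 behaviour
    if (length(idx) > 1) idx <- sample(idx)
    # Cumulative cut points give each set roughly ratio * class size rows
    cuts <- round(cumsum(ratio) * length(idx))
    sizes <- diff(c(0, cuts))
    assignment[idx] <- rep(seq_len(n_sets), times = sizes)
  }
  # One data frame per set, in the order given by `ratio`
  lapply(seq_len(n_sets), function(i) df[assignment == i, , drop = FALSE])
}

set.seed(1)
toy <- data.frame(id = 1:50,
                  target_col = rep(c("A", "B"), each = 25),
                  stringsAsFactors = FALSE)
parts <- stratified_split_sketch(toy, "target_col", c(0.1, 0.4, 0.3, 0.2))

# Class counts per set; each set should mirror the 50/50 class balance
lapply(parts, function(p) table(p$target_col))
```

Because the sketch splits each class separately with the same proportions, every row is assigned to exactly one set and the class balance within each set tracks the balance of the full data. Small classes can still round to zero rows in a set, which is the case the "at least one observation per class" requirement is meant to rule out.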