split_data: Split data by target's distribution
In agritag/infeR: Inferring information from data


Splits a data set into n subsets according to the distribution of the target column, sampling with or without replacement. Each subset should contain at least one observation from every target class so that the distribution is preserved.
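The idea described above can be sketched in base R. This is only an illustration under stated assumptions, not the implementation used by infeR: the function name `stratified_split_sketch` is hypothetical, `ratio` is assumed to sum to 1, and the rule "one row per class per set first, then the rest by ratio" is a simplification chosen to guarantee that no class is missing from any set.

```r
# Hypothetical helper, NOT part of infeR: split `df` into length(ratio) sets,
# sampling each target class separately so every set keeps the class mix.
stratified_split_sketch <- function(df, target_column, ratio, replace = FALSE) {
  stopifnot(is.data.frame(df),
            target_column %in% names(df),
            is.numeric(ratio), all(ratio > 0),
            isTRUE(all.equal(sum(ratio), 1)))  # assumption: ratios sum to 1
  n_parts <- length(ratio)
  # Row indices grouped by class label of the target column
  rows_by_class <- split(seq_len(nrow(df)), df[[target_column]])
  part_rows <- vector("list", n_parts)

  for (rows in rows_by_class) {
    m <- length(rows)
    if (replace) {
      # With replacement: draw about ratio * m rows per set, at least one
      sizes <- pmax(1, round(ratio * m))
      for (i in seq_len(n_parts)) {
        drawn <- rows[sample.int(m, sizes[i], replace = TRUE)]
        part_rows[[i]] <- c(part_rows[[i]], drawn)
      }
    } else {
      # Without replacement: each class needs at least one row per set
      stopifnot(m >= n_parts)
      # Reserve one row per set, share out the remaining rows by ratio
      extra <- ratio * (m - n_parts)
      sizes <- 1 + floor(extra)
      leftover <- m - sum(sizes)
      # Hand any leftover rows to the sets with the largest fractional share
      bump <- order(extra - floor(extra), decreasing = TRUE)[seq_len(leftover)]
      sizes[bump] <- sizes[bump] + 1
      # Shuffle the class rows, then cut them into consecutive chunks
      shuffled <- rows[sample.int(m)]
      ends <- cumsum(sizes)
      starts <- ends - sizes + 1
      for (i in seq_len(n_parts)) {
        part_rows[[i]] <- c(part_rows[[i]], shuffled[starts[i]:ends[i]])
      }
    }
  }
  lapply(part_rows, function(idx) df[idx, , drop = FALSE])
}

# Usage on a small balanced data set
set.seed(42)
toy <- data.frame(id = 1:50,
                  target_col = rep(c("A", "B"), each = 25),
                  stringsAsFactors = FALSE)
parts <- stratified_split_sketch(toy, "target_col", c(0.1, 0.4, 0.3, 0.2))
sapply(parts, nrow)
```

Because one row per class is reserved for every set before the ratio is applied, very small classes are spread more evenly than `ratio` alone would suggest; the actual package may handle this case differently.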

split_data(df = df, target_column = "some_column", ratio = ratio, replace = replace)

`df`	data.frame; the data set to split.
`target_column`	string; name of the target column. `df` must contain this column.
`ratio`	numeric vector; proportions in which to split the data, one per resulting set, so `length(ratio) == n`.
`replace`	logical; whether to sample with replacement.

A list of n data frames, each with the same target distribution.
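One way to check that the returned data frames share a target distribution is to tabulate the class proportions in each of them. The helper `target_proportions` below is hypothetical and not part of infeR, and the list `parts` is built by hand as a stand-in for a `split_data` result.

```r
# Hypothetical helper, NOT part of infeR: class proportions of the target
# column in each data frame of a list.
target_proportions <- function(parts, target_column) {
  lapply(parts, function(part) prop.table(table(part[[target_column]])))
}

# Hand-built stand-in for a list returned by split_data()
parts <- list(
  data.frame(x = 1:4,  target_col = c("A", "A", "B", "B")),
  data.frame(x = 5:10, target_col = c("A", "A", "A", "B", "B", "B"))
)

# Each element holds the share of every class in the corresponding set
target_proportions(parts, "target_col")
```

If the proportions differ noticeably between sets, the split has not preserved the target distribution.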

# Example data: 50 rows with a balanced two-class target (25 "A", 25 "B")
df <- data.frame(column1 = rep(TRUE, 50),
                 column2 = c(LETTERS[1:25], LETTERS[26:2]),
                 column3 = seq(1, 50),
                 column4 = c(rep("a", 45), rep("b", 5)),
                 column5 = seq(1, 50, by = 1),
                 target_col = c(rep("A", 25), rep("B", 25)),
                 stringsAsFactors = FALSE)
# Split into four sets using proportions 0.1, 0.4, 0.3 and 0.2
split_data(df = df, target_column = "target_col", ratio = c(0.1, 0.4, 0.3, 0.2))