splitDataToTrainTestStratified: Function to do stratified sampling for model data


View source: R/dataManipulations.R

Description

Performs a stratified split of a dataset and returns training, test, and evaluation sets as data.frames or data.tables.

Usage

splitDataToTrainTestStratified(x, strataCols = NULL, trainProportion,
  evalProportion = NULL, levels = 50, seed = 999, DT = T)

Arguments

x

A data.frame or data.table object containing all data to be split

strataCols

Names of columns or features that will be used to do stratification

trainProportion

Proportion of the data assigned to the training set (e.g. 0.7)

evalProportion

Proportion of the data assigned to the evaluation (validation) set. Defaults to NULL

levels

Threshold used to decide whether a feature is categorical and therefore eligible for stratification. Defaults to 50

seed

Random seed for reproducibility. Defaults to 999

DT

Logical; whether the function returns a list of data.tables rather than data.frames. Defaults to TRUE
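
The page does not say how the levels threshold is applied. One plausible reading, shown here only as a sketch, is that a column qualifies for stratification when its number of distinct values does not exceed levels. The helper pickStrataCols is hypothetical and is not part of RQuant; the comparison rule is an assumption.

```r
# Hypothetical helper, NOT part of RQuant. Assumption: a column counts as
# categorical when it has at most `levels` distinct values.
pickStrataCols <- function(x, strataCols = names(x), levels = 50) {
  x <- as.data.frame(x)  # also accepts a data.table
  # Count distinct values in each candidate column
  nDistinct <- vapply(x[strataCols],
                      function(col) length(unique(col)),
                      integer(1))
  # Keep only the columns at or below the threshold
  strataCols[nDistinct <= levels]
}

data(mtcars)
pickStrataCols(mtcars, levels = 15)
```

Under this reading, continuous columns such as mpg would be dropped from the stratification set, while low-cardinality columns such as cyl and am would be kept.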

Value

A list of data.tables or data.frames containing the training, test, and evaluation sets

Examples

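library(RQuant)
data(mtcars)
# Stratify on all columns: trainProportion 0.7, evalProportion 0.3, levels threshold 15
res <- splitDataToTrainTestStratified(mtcars, strataCols = names(mtcars),
  trainProportion = 0.7, evalProportion = 0.3, levels = 15, seed = 999, DT = TRUE)
res$trainData  # training set
res$testData   # test set
res$evalData   # evaluation set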
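
For readers who want to see the general idea behind a stratified split, the following base R sketch samples rows within each combination of the stratification columns. It is not the RQuant implementation: the function name stratifiedSplitSketch is hypothetical, it produces only a train/test pair (no evaluation set), and the per-stratum rounding rule is an assumption.

```r
# Hypothetical sketch, NOT the RQuant implementation: each stratum
# contributes roughly `trainProportion` of its rows to the training set.
stratifiedSplitSketch <- function(x, strataCols, trainProportion, seed = 999) {
  set.seed(seed)
  x <- as.data.frame(x)
  # One stratum per observed combination of the stratification columns
  strata <- interaction(x[strataCols], drop = TRUE)
  # Sample row indices independently within each stratum
  trainIdx <- unlist(lapply(split(seq_len(nrow(x)), strata), function(idx) {
    nTrain <- round(length(idx) * trainProportion)
    idx[sample.int(length(idx), size = nTrain)]
  }), use.names = FALSE)
  # Everything not sampled for training goes to the test set
  testIdx <- setdiff(seq_len(nrow(x)), trainIdx)
  list(trainData = x[trainIdx, , drop = FALSE],
       testData  = x[testIdx, , drop = FALSE])
}

data(mtcars)
sketch <- stratifiedSplitSketch(mtcars, c("cyl", "am"), trainProportion = 0.7)
nrow(sketch$trainData) + nrow(sketch$testData) == nrow(mtcars)
```

Sampling within each stratum, rather than over the whole table, keeps the mix of cyl and am values in the training set close to that of the full data.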

ivanliu1989/RQuant documentation built on Sept. 13, 2019, 11:53 a.m.