splitDataset: Split a data set for machine learning classification


View source: R/simulation.R

Description

Splits a data set into a training set, a holdout set and a validation set according to the percentage given for each partition, and returns them as a list. The default is a 50-50 split into training and holdout, with no validation (testing) set. Class labels/phenotypes are coded as 1 and -1. The label argument (class or phenos) lets the user treat the simulated outcome as dichotomous or quantitative.

Usage

splitDataset(all.data = NULL, pct.train = 0.5, pct.holdout = 0.5,
  pct.validation = 0, label = "class")

Arguments

all.data

A data frame of n rows by d columns of data plus a label column

pct.train

A numeric percentage of samples to use for training

pct.holdout

A numeric percentage of samples to use for holdout

pct.validation

A numeric percentage of samples to use for validation (testing)

label

A character string giving the name of the data column that holds the outcome label: class for classification, phenos for regression.

Value

A list containing:

train

training data set

holdout

holdout data set

validation

validation data set
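
To make the shape of the returned list concrete, here is a minimal base R sketch of a three-way split by percentage. It is an illustration only, not the privateEC implementation: the helper name split_by_pct and the toy data frame are hypothetical, and the sketch does a plain random split without any of the label handling that splitDataset may perform.

```r
# Hypothetical helper, NOT part of privateEC: a plain random three-way split.
split_by_pct <- function(all.data, pct.train = 0.5, pct.holdout = 0.5,
                         pct.validation = 0) {
  stopifnot(is.data.frame(all.data),
            isTRUE(all.equal(pct.train + pct.holdout + pct.validation, 1)))
  n <- nrow(all.data)
  idx <- sample.int(n)  # shuffled row indices

  # Cut points after the training and holdout partitions, capped at n.
  cuts <- pmin(round(cumsum(c(pct.train, pct.holdout)) * n), n)

  train.idx      <- idx[seq_len(cuts[1])]
  holdout.idx    <- idx[cuts[1] + seq_len(cuts[2] - cuts[1])]
  validation.idx <- idx[cuts[2] + seq_len(n - cuts[2])]

  list(train      = all.data[train.idx, , drop = FALSE],
       holdout    = all.data[holdout.idx, , drop = FALSE],
       validation = all.data[validation.idx, , drop = FALSE])
}

# Toy data (assumed for illustration): two predictors plus a class column
# coded as 1 and -1.
set.seed(1)
toy <- data.frame(x1 = rnorm(10), x2 = rnorm(10),
                  class = rep(c(-1, 1), each = 5))
parts <- split_by_pct(toy, pct.train = 0.6, pct.holdout = 0.2,
                      pct.validation = 0.2)
print(sapply(parts, nrow))
```

Every row lands in exactly one partition, and a partition with a percentage of 0 comes back as a data frame with zero rows rather than NULL. Whether splitDataset behaves the same way for empty partitions is not stated on this page.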

See Also

Other simulation: createInteractions, createMainEffects, createMixedSimulation, createSimulation

Examples

data("rsfMRIcorrMDD")
data.sets <- splitDataset(rsfMRIcorrMDD)
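
A three-way split can be requested by setting all three percentages explicitly. The sketch below uses only the arguments listed under Usage. The particular values (0.5, 0.25, 0.25) are an arbitrary choice for illustration, and it is an assumption here that the package accepts any percentages that sum to 1. The requireNamespace guard simply skips the call when privateEC is not installed.

```r
# Assumes the privateEC package is installed; the guard skips the call otherwise.
if (requireNamespace("privateEC", quietly = TRUE)) {
  data("rsfMRIcorrMDD", package = "privateEC")
  # Illustrative percentages; assumed (not confirmed here) to be valid input.
  data.sets <- privateEC::splitDataset(all.data = rsfMRIcorrMDD,
                                       pct.train = 0.5,
                                       pct.holdout = 0.25,
                                       pct.validation = 0.25,
                                       label = "class")
  # Inspect the top level of the returned list (train, holdout, validation).
  str(data.sets, max.level = 1)
}
```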

hexhead/privateEC documentation built on July 20, 2018, 12:30 p.m.