create.data.split: Split a dataset into training and test sets.


View source: R/create_data_split.r

Description

This function prepares the cross-validation by splitting the data into num.folds training and test folds, repeating the split num.resample times.

Usage

create.data.split(siamcat, num.folds = 2, num.resample = 1,
    stratify = TRUE, inseparable = NULL, verbose = 1)

Arguments

siamcat

object of class siamcat-class

num.folds

number of cross-validation folds (needs to be >=2), defaults to 2

num.resample

resampling rounds (values <= 1 deactivate resampling), defaults to 1

stratify

boolean, whether the splits should be stratified so that each class is present in equal proportion in every fold, defaults to TRUE

inseparable

column name of a metadata variable that is to be kept inseparable during splitting, defaults to NULL

verbose

control output: 0 for no output at all, 1 for only information about progress and success, 2 for normal level of information and 3 for full debug information, defaults to 1
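
The page does not spell out how inseparable is applied. A minimal sketch in base R, assuming it means that samples sharing the same value of the metadata variable always end up in the same fold, could look as follows. The helper draw.grouped.folds and the toy variable individual are hypothetical and not part of SIAMCAT.

```r
# Hypothetical helper, not part of SIAMCAT: assign whole groups to folds so
# that samples sharing a value of `group` are never split across folds.
draw.grouped.folds <- function(group, num.folds = 2) {
    groups <- unique(group)
    stopifnot(num.folds >= 2, length(groups) >= num.folds)
    # Shuffle the groups, then deal them round-robin into the folds.
    groups <- groups[sample.int(length(groups))]
    group.fold <- rep_len(seq_len(num.folds), length(groups))
    # Look up each sample's fold through the fold of its group.
    group.fold[match(group, groups)]
}

set.seed(1)
# Assumed toy metadata: 12 samples from 6 individuals, two samples each.
individual <- rep(paste0("subject", 1:6), each = 2)
fold.id <- draw.grouped.folds(individual, num.folds = 3)

# Number of distinct folds per individual; every entry is 1.
folds.per.individual <- tapply(fold.id, individual,
    function(f) length(unique(f)))
print(folds.per.individual)
```

Assigning folds per group rather than per sample is what keeps related samples from appearing on both the training and the test side of the same fold.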

Details

This function splits the labels within a siamcat-class object and prepares the internal cross-validation for the model training (see train.model).
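
To illustrate the idea of a repeated, stratified split, here is a minimal sketch in base R. It is not the SIAMCAT implementation: the helpers draw.folds and sketch.data.split, the element names train and test, and the toy label vector are all assumptions made for this example.

```r
# Hypothetical helper, not part of SIAMCAT: assign each sample to one of
# num.folds folds, optionally balancing the classes across folds.
draw.folds <- function(label, num.folds = 2, stratify = TRUE) {
    stopifnot(num.folds >= 2, length(label) >= num.folds)
    fold.id <- integer(length(label))
    if (stratify) {
        # Shuffle the samples of each class separately and deal them
        # round-robin, so every fold gets a similar share of each class.
        for (cl in unique(label)) {
            idx <- which(label == cl)
            idx <- idx[sample.int(length(idx))]
            fold.id[idx] <- rep_len(seq_len(num.folds), length(idx))
        }
    } else {
        # Ignore the classes and deal all shuffled samples round-robin.
        idx <- sample.int(length(label))
        fold.id[idx] <- rep_len(seq_len(num.folds), length(label))
    }
    fold.id
}

# Hypothetical helper: repeat the fold assignment num.resample times and
# store training and test indices for every fold of every round.
sketch.data.split <- function(label, num.folds = 2, num.resample = 1,
    stratify = TRUE) {
    num.resample <- max(1, num.resample)  # values <= 1: a single round
    lapply(seq_len(num.resample), function(r) {
        fold.id <- draw.folds(label, num.folds, stratify)
        folds <- seq_len(num.folds)
        list(train = lapply(folds, function(f) which(fold.id != f)),
            test = lapply(folds, function(f) which(fold.id == f)))
    })
}

set.seed(1)
# Assumed toy labels: 30 samples of class -1 and 20 samples of class 1.
label <- rep(c(-1, 1), times = c(30, 20))
splits <- sketch.data.split(label, num.folds = 5, num.resample = 2)

# Each test fold holds 6 samples of class -1 and 4 samples of class 1.
print(table(label[splits[[1]]$test[[1]]]))
```

Within one round every sample appears in exactly one test fold, and each resampling round draws a fresh assignment.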

The function saves the training and test instances for the different cross-validation folds in the data_split-slot of the siamcat-class object, which is a list with four entries.

Value

object of class siamcat-class with the data_split-slot filled

Examples

data(siamcat_example)
# simple working example
siamcat_split <- create.data.split(siamcat_example, num.folds=10,
    num.resample=5, stratify=TRUE)

## example with a variable which is to be inseparable
## siamcat_split <- create.data.split(siamcat_example, num.folds=10,
##     num.resample=5, stratify=FALSE, inseparable='Gender')

KonradZych/SIAMCAT documentation built on May 17, 2019, 6:20 p.m.