subjectSplitter: Split data when patients are in the data multiple times such...


View source: R/DataSplitting.R

Description

Splits data in which patients may appear multiple times so that each patient is assigned entirely to either the train set or the test set. The same patient can never be in the train set at one time and in the test set at another.

Usage

subjectSplitter(population, test = 0.3, train = NULL, nfold = 3, seed = NULL)

Arguments

population

An object created using createStudyPopulation().

test

A real number between 0 and 1 indicating the fraction of the data assigned to the test set.

train

A real number between 0 and 1 indicating the fraction of the data assigned to the train set. If not set, train is equal to 1 - test.

nfold

An integer >= 1 specifying the number of folds used in cross-validation.

seed

If set, a fixed seed is used so the split is reproducible; otherwise a random split is performed.

Details

Returns a data frame of rowIds and indexes. An index of -1 indicates that the rowId belongs to the test set; a positive integer index gives the rowId's cross-validation fold within the train set.
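The idea of a subject-level split can be illustrated with a minimal base R sketch. This is not the package's implementation: sketchSubjectSplit is a hypothetical helper, the population is assumed to have rowId and subjectId columns, and any additional logic the real function may apply (for example handling of the train argument) is left out. It only shows how assigning the index per subject, rather than per row, keeps every row of a patient on the same side of the split.

```r
# Hypothetical helper for illustration only; not part of PatientLevelPrediction.
# Assumes `population` has the columns rowId and subjectId.
sketchSubjectSplit <- function(population, test = 0.3, nfold = 3, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)

  # Work on unique subjects so that all rows of a patient share one index
  subjects <- unique(population$subjectId)
  subjects <- subjects[sample.int(length(subjects))]  # random order

  # The first nTest shuffled subjects form the test set (index -1);
  # the remaining subjects are cycled through folds 1..nfold
  nTest <- round(length(subjects) * test)
  subjectIndex <- c(rep(-1, nTest),
                    rep(seq_len(nfold), length.out = length(subjects) - nTest))

  # Map each row back to the index of its subject
  data.frame(rowId = population$rowId,
             index = subjectIndex[match(population$subjectId, subjects)])
}

# Toy population: 6 subjects, each appearing twice
population <- data.frame(rowId = 1:12, subjectId = rep(1:6, each = 2))
splitResult <- sketchSubjectSplit(population, test = 0.33, nfold = 2, seed = 42)
```

Because the shuffle and the fold assignment both operate on subjectId, two rows of the same patient always receive the same index, whichever subjects the random draw selects.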

Value

A data frame containing the columns rowId and index.
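As a rough illustration of how the returned rowId and index columns might be consumed, the sketch below uses a hand-made data frame in the documented shape. The values and the variable names (testRowIds, validationRowIds, fittingRowIds) are made up for the example and do not come from the package.

```r
# Hand-made example in the documented shape: -1 = test set, 1..3 = train folds
splitResult <- data.frame(rowId = 1:6, index = c(-1, 1, 2, 3, -1, 1))

# Rows held out for final evaluation
testRowIds <- splitResult$rowId[splitResult$index == -1]

# Rows available for training (all positive fold indexes)
trainRowIds <- splitResult$rowId[splitResult$index > 0]

# For cross-validation fold k: validate on fold k, fit on the other folds
k <- 1
validationRowIds <- splitResult$rowId[splitResult$index == k]
fittingRowIds <- splitResult$rowId[splitResult$index > 0 & splitResult$index != k]
```

Filtering on index > 0 rather than index != -1 keeps the selection limited to rows that actually carry a fold number.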


OHDSI/PatientLevelPrediction documentation built on Aug. 30, 2020, 9:33 a.m.