Split: Split


Description

Split splits a corpus into a training set, a test set, and an optional validation set.

Usage


Arguments

corpus

Corpus object.

name

Character string indicating the name for the cross-validation set.

train

Numeric indicating the proportion of the Corpus to allocate to the training set. Acceptable values are between 0 and 1. The total of the values for the train, validation and test parameters must equal 1.

validation

Numeric indicating the proportion of the Corpus to allocate to the validation set. Acceptable values are between 0 and 1. The total of the values for the train, validation and test parameters must equal 1.

test

Numeric indicating the proportion of the Corpus to allocate to the test set. Acceptable values are between 0 and 1. The total of the values for the train, validation and test parameters must equal 1.

stratify

Logical. If TRUE (default), splits and sampling will be stratified.

seed

Numeric used to initialize a pseudorandom number generator.
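
The constraints on train, validation and test described above can be checked before any splitting is attempted. The following is a minimal sketch in base R, not code from NLPStudio: the helper name check_proportions and the default of 0 for validation are assumptions made for illustration. It compares the total against 1 with a tolerance, because sums of decimal fractions are not always exactly 1 in floating-point arithmetic.

```r
# Hypothetical helper (not part of NLPStudio): validates split proportions.
check_proportions <- function(train, validation = 0, test) {
  props <- c(train = train, validation = validation, test = test)

  # Reject non-numeric or missing inputs.
  if (!is.numeric(props) || anyNA(props)) {
    stop("Proportions must be non-missing numerics.")
  }

  # Each proportion must lie between 0 and 1.
  if (any(props < 0 | props > 1)) {
    stop("Each proportion must be between 0 and 1.")
  }

  # The three proportions must total 1; all.equal() compares with a tolerance.
  if (!isTRUE(all.equal(sum(props), 1))) {
    stop("The train, validation and test proportions must sum to 1.")
  }

  props
}

# Example: a 70/10/20 split passes the checks and is returned unchanged.
props <- check_proportions(train = 0.7, validation = 0.1, test = 0.2)
```

Returning the named vector lets a caller pass the validated proportions on to whatever performs the split.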

Format

An object of class R6ClassGenerator of length 24.

Details

Splits a corpus into a training set, a test set, and an optional validation set. The resulting corpora are combined into a single cross-validation set, returned as a CVSet object.
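
To illustrate the idea of a seeded, stratified split, here is a rough sketch in base R. It is not the NLPStudio implementation and does not use the Corpus or CVSet classes. The function name split_indices, the strata argument (one label per document) and the plain list returned in place of a CVSet are all assumptions made for this example. How Split itself defines strata is not stated on this page.

```r
# Hypothetical sketch (not the NLPStudio implementation): returns document
# indices for each partition, sampled separately within each stratum.
split_indices <- function(strata, train = 0.7, validation = 0.1, test = 0.2,
                          seed = NULL) {
  stopifnot(isTRUE(all.equal(train + validation + test, 1)))
  if (!is.null(seed)) set.seed(seed)  # makes the shuffle reproducible

  parts <- lapply(split(seq_along(strata), strata), function(i) {
    # Shuffle via sample.int() so a single-element stratum is handled safely.
    i <- i[sample.int(length(i))]
    n <- length(i)
    n_train <- round(n * train)
    n_val <- min(round(n * validation), n - n_train)
    list(
      train      = i[seq_len(n_train)],
      validation = i[n_train + seq_len(n_val)],
      test       = i[n_train + n_val + seq_len(n - n_train - n_val)]
    )
  })

  # Pool the per-stratum indices into one vector per partition.
  pool <- function(part) {
    sort(as.integer(unlist(lapply(parts, `[[`, part), use.names = FALSE)))
  }
  list(train = pool("train"), validation = pool("validation"),
       test = pool("test"))
}

# Example: 100 documents from two made-up sources, split 70/10/20.
labels <- rep(c("news", "blogs"), times = c(60, 40))
parts <- split_indices(labels, train = 0.7, validation = 0.1, test = 0.2,
                       seed = 42)
lengths(parts)  # train = 70, validation = 10, test = 20
```

Because sampling happens within each stratum, each partition keeps the 60/40 balance of the two sources. Rerunning with the same seed yields the same partitions.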

Value

CVSet object

Methods

Author(s)

John James, jjames@datasciencesalon.org

See Also

Other CorpusStudio Family of Classes: CorpusStudio, KFold, Sample0, Sample, Segment, TokenizerNLP, TokenizerQ, Tokenizer, Token


DecisionScients/NLPStudio documentation built on May 15, 2019, 12:51 p.m.