This function takes a factor with the class labels of the total data set,
draws a sample (balanced with respect to the different levels of the factor)
and returns a logical vector indicating whether each observation is in the
learning sample (TRUE) or not (FALSE).
inTrainingSample(y, propTraining = 2/3, classdist = c("balanced",
    "unbalanced"))
y | factor with the class labels for the total data set
propTraining | proportion of the data that should be in the training set; the default value is 2/3.
classdist | distribution of classes; indicates whether the class distribution is 'balanced' or 'unbalanced'. The sampling strategy for each run is adapted accordingly.
Returns a logical vector indicating, for each observation in y, whether
the observation is in the learning sample (TRUE) or not (FALSE).
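Because the returned vector is aligned with y, it can be used directly as a row index. The following is a minimal sketch of that use, not code from the package: the data frame `dat`, its column `x`, and the hand-built `inTrain` vector are illustrative stand-ins for real data and for the value that inTrainingSample would return.

```r
# Illustrative data; 'dat' and its columns are hypothetical
y <- factor(c("A", "A", "A", "B", "B", "B"))
dat <- data.frame(x = c(1.2, 0.7, 1.9, 3.1, 2.8, 3.5), y = y)

# Stand-in for the logical vector that inTrainingSample would return
inTrain <- c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE)

trainSet <- dat[inTrain, ]   # rows flagged TRUE: learning sample
testSet  <- dat[!inTrain, ]  # rows flagged FALSE: remaining observations

nrow(trainSet)  # 4
nrow(testSet)   # 2
```

Negating the vector with `!` gives the complementary set, so every observation ends up in exactly one of the two subsets.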
Author(s): Willem Talloen and Tobias Verbeke
### this example demonstrates the logic of sampling in case of unbalanced distribution of classes
y <- factor(c(rep("A", 21), rep("B", 80)))
nlcv:::inTrainingSample(y, 2/3, "unbalanced")
table(y[nlcv:::inTrainingSample(y, 2/3, "unbalanced")]) # should be 14, 14 (for A, B resp.)
table(y[!nlcv:::inTrainingSample(y, 2/3, "unbalanced")]) # should be 7, 66 (for A, B resp.)
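The counts in the example (14 of 21 "A" and 14 of 80 "B" in the training set) are consistent with drawing the same number of observations from every class, namely propTraining times the size of the smallest class. The sketch below reproduces those counts under that assumption. The helper `sketchUnbalancedSample` is hypothetical and is not the package's implementation; it only illustrates the logic for the "unbalanced" case.

```r
# Hypothetical helper, not part of nlcv: mimics the "unbalanced" behaviour
# implied by the example counts above.
sketchUnbalancedSample <- function(y, propTraining = 2/3) {
  stopifnot(is.factor(y))
  # Assumption: each class contributes a fraction of the smallest class size
  nPerClass <- round(propTraining * min(table(y)))
  inTraining <- logical(length(y))
  for (lev in levels(y)) {
    idx <- which(y == lev)
    # sample.int() draws positions, avoiding the sample(x) pitfall
    # when idx has length one
    inTraining[idx[sample.int(length(idx), nPerClass)]] <- TRUE
  }
  inTraining
}

set.seed(1)  # which observations are drawn depends on the seed; the counts do not
y <- factor(c(rep("A", 21), rep("B", 80)))
inTrain <- sketchUnbalancedSample(y)
table(y[inTrain])   # A: 14, B: 14
table(y[!inTrain])  # A: 7,  B: 66
```

Storing the result once in `inTrain`, as done here, guarantees that the training and test tables describe the same random split; calling a sampling function separately for each table draws two different splits, even though the class counts agree.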