segmkf | R Documentation |
Build segments of observations for K-Fold or "test-set" cross-validation (CV).
The CV can optionally be repeated with random resampling. For each repetition:
- K-fold CV - Function segmkf returns the K segments.
- Test-set CV - Function segmts returns a segment (of a given length) randomly sampled in the dataset.
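As a rough illustration of the two schemes, here is a minimal base-R sketch. The helpers kfold_sketch and testset_sketch are hypothetical names introduced for this example; they are not the package's code and may differ from what segmkf and segmts do internally.

```r
# Hypothetical helpers (illustration only, base R; not the package's implementation).

kfold_sketch <- function(n, K) {
  # Shuffle the indexes 1..n, then deal them into K groups of near-equal size.
  idx <- sample(n)
  fold <- rep(seq_len(K), length.out = n)
  segm <- split(idx, fold)
  names(segm) <- paste0("segm", seq_len(K))
  segm
}

testset_sketch <- function(n, m) {
  # Draw m indexes without replacement to form a single test segment.
  list(segm1 = sample(n, size = m))
}

set.seed(1)
kf <- kfold_sketch(n = 10, K = 3)   # 3 disjoint segments covering 1..10
ts <- testset_sketch(n = 10, m = 3) # one segment of 3 indexes
kf
ts
```

In the K-fold sketch every observation falls in exactly one segment, whereas the test-set sketch only selects m observations and leaves the rest for training.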
CV of blocks
Argument y allows sampling blocks of observations instead of individual observations. This can be required when there are repetitions in the data. In such a situation, CV should account for the repetition level (otherwise, the error rates are generally strongly underestimated). To implement such a CV, object y must be a vector (of length n) defining the blocks, in the same order as the observations in the data.
In all cases (y = NULL or not), the functions return a list of vector(s). Each vector contains the indexes of the observations defining the segment.
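The idea of sampling blocks rather than observations can be sketched in base R as follows. The function block_kfold_sketch is a hypothetical helper written for this illustration, under the assumption that all observations sharing the same value of y must end up in the same segment; the package's actual implementation may differ.

```r
# Hypothetical helper (illustration only; not the package's implementation).
block_kfold_sketch <- function(y, K) {
  y <- as.character(y)
  # Shuffle the block labels, then deal the blocks (not the rows) into K folds.
  blocks <- sample(unique(y))
  fold <- rep(seq_len(K), length.out = length(blocks))
  # Map each fold back to the row indexes of the blocks it contains.
  segm <- lapply(seq_len(K), function(k) which(y %in% blocks[fold == k]))
  names(segm) <- paste0("segm", seq_len(K))
  segm
}

set.seed(1)
y <- rep(LETTERS[1:5], 2)        # 5 blocks, each observed twice
z <- block_kfold_sketch(y, K = 3)
z
lapply(z, function(i) y[i])      # block labels present in each segment
```

Because whole blocks are assigned to folds, both repetitions of a given block are always held out together, which is the point of block-wise CV.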
segmkf(n, y = NULL, K = 5,
type = c("random", "consecutive", "interleaved"), nrep = 1)
segmts(n, y = NULL, m, nrep)
n |
The total number of row observations in the dataset. If |
y |
A vector ( |
K |
The number of folds (i.e. segments) in the K-fold CV. |
type |
The type of K-fold CV. Possible values are "random" (default), "consecutive" and "interleaved". |
m |
For |
nrep |
The number of replications of the repeated CV. Default to |
The segments (lists of indexes).
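This page does not define the three values of type. The base-R sketch below assumes the common convention for these names (contiguous runs for "consecutive", every K-th index for "interleaved", shuffled indexes for "random"); this is an assumption, and the exact segment sizes and ordering produced by segmkf may differ.

```r
# Illustration only, under an assumed convention (not the package's code).
n <- 10
K <- 3
fold <- rep(seq_len(K), length.out = n)   # 1 2 3 1 2 3 1 2 3 1

# "consecutive" (assumed): contiguous runs of indexes.
consecutive <- split(seq_len(n), sort(fold))

# "interleaved" (assumed): indexes dealt in turn, i.e. every K-th index.
interleaved <- split(seq_len(n), fold)

# "random" (assumed): same dealing, applied to shuffled indexes.
set.seed(1)
random <- split(sample(n), fold)

consecutive
interleaved
random
```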
######### K-fold
segmkf(n = 10, K = 3)
segmkf(n = 10, K = 3, type = "interleaved")
# Leave-one-out
segmkf(n = 10, K = 10)
# Repeated
segmkf(n = 10, K = 3, nrep = 2)
######### Test-set (repeated)
segmts(n = 10, m = 3, nrep = 5)
######### With blocks
n <- 10
y <- rep(LETTERS[1:5], 2)
y
z <- segmkf(n = n, y = y, K = 3, nrep = 1)
z
y[z$rep1$segm1]
y[z$rep1$segm2]
y[z$rep1$segm3]
z <- segmts(n = n, y = y, m = 3, nrep = 1)
z
y[z$rep1$segm1]
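A list of segments is typically consumed in a loop that holds out one segment at a time. The sketch below uses simulated toy data and hand-built segments that mimic the nested list shape seen above (rep1$segm1, ...); the data, the lm model and the variable names are assumptions made for this illustration and do not come from the package.

```r
# Illustration only: toy data and hand-built segments (base R + stats).
set.seed(1)
n <- 30
dat <- data.frame(x = rnorm(n))
dat$yresp <- 2 * dat$x + rnorm(n, sd = 0.5)   # simulated response

# Segments shaped like the output shown above: one repetition, K segments.
K <- 3
segm <- list(rep1 = split(sample(n), rep(seq_len(K), length.out = n)))
names(segm$rep1) <- paste0("segm", seq_len(K))

# For each segment: fit on the remaining rows, predict the held-out rows.
sqerr <- numeric(n)
for (s in segm$rep1) {
  fm <- lm(yresp ~ x, data = dat[-s, ])
  pred <- predict(fm, newdata = dat[s, ])
  sqerr[s] <- (dat$yresp[s] - pred)^2
}

# Root mean squared error of cross-validation over all observations.
rmsecv <- sqrt(mean(sqerr))
rmsecv
```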