segmkf | R Documentation |
Build segments of observations for K-Fold or "test-set" (i.e. Monte Carlo) cross-validation (CV).
The CV can optionally be repeated, with a new random draw at each repetition. For each repetition:
- Function segmkf (K-fold CV) returns the K segments.
- Function segmts (test-set CV) returns a segment (of a given length) randomly sampled in the data set.
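To make the two schemes concrete, here is a minimal base-R sketch of how such segments could be built. It is not the package's implementation: the helpers kfold_sketch and testset_sketch are hypothetical, and the reading of "consecutive" (contiguous runs of indexes) and "interleaved" (every K-th index) is an assumption based on the option names.

```r
# Hypothetical helpers for illustration only (not part of the package).

# K-fold: split the indexes 1..n into K segments.
kfold_sketch <- function(n, K, type = c("random", "consecutive", "interleaved")) {
  type <- match.arg(type)
  fold <- switch(type,
    random      = sample(rep_len(seq_len(K), n)),  # shuffled fold labels
    consecutive = sort(rep_len(seq_len(K), n)),    # contiguous runs (assumed meaning)
    interleaved = rep_len(seq_len(K), n)           # 1, 2, ..., K, 1, 2, ... (assumed meaning)
  )
  segm <- split(seq_len(n), factor(fold, levels = seq_len(K)))
  names(segm) <- paste0("segm", seq_len(K))
  segm
}

# Test-set: draw one segment of m indexes at random, without replacement.
testset_sketch <- function(n, m) sort(sample.int(n, m))

kfold_sketch(10, 3, "consecutive")$segm1   # 1 2 3 4
kfold_sketch(10, 3, "interleaved")$segm2   # 2 5 8
testset_sketch(10, 3)                      # three distinct indexes in 1..10
```

In this sketch the K segments of a K-fold split partition the data, so every observation is held out exactly once per repetition, whereas a test-set draw holds out only m observations each time.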
CV of blocks
Argument y allows sampling blocks of observations instead of individual observations. This is useful, for instance, when there are repetitions in the data. In such a situation, CV should account for the repetition level; otherwise, error rates will in general be strongly underestimated. To implement such a CV, y must be a vector of length n defining the blocks (in the same order as the data).
In all cases (y = NULL or not), the functions return a list of vector(s). Each vector contains the indexes of the observations defining the segment.
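The block idea can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: block_kfold_sketch is a hypothetical helper that assigns whole blocks (the unique values of y) to folds, then maps each fold back to observation indexes, so that all observations of a block fall in the same segment.

```r
# Hypothetical helper for illustration only (not part of the package).
block_kfold_sketch <- function(y, K) {
  blocks <- unique(y)
  # Assign each block (not each observation) to one of the K folds at random
  fold <- sample(rep_len(seq_len(K), length(blocks)))
  # Every observation inherits the fold of its block
  obs_fold <- fold[match(y, blocks)]
  segm <- split(seq_along(y), factor(obs_fold, levels = seq_len(K)))
  names(segm) <- paste0("segm", seq_len(K))
  segm
}

y <- rep(LETTERS[1:5], 2)   # 5 blocks, each observed twice
set.seed(1)
z <- block_kfold_sketch(y, K = 3)
lapply(z, function(i) y[i]) # both observations of a block land in the same segment
```

Because the two observations of a block are never split between a training part and a held-out segment, the held-out error is not flattered by near-duplicates of the training data.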
segmkf(n, y = NULL, K = 5, type = c("random", "consecutive", "interleaved"),
nrep = 1, seed = NULL)
segmts(n, y = NULL, m, nrep, seed = NULL)
n |
The total number of row observations in the data set. If |
y |
A vector of length |
K |
For |
type |
For |
m |
For |
nrep |
The number of replications of the repeated CV. Default to |
seed |
An integer defining the seed for the random simulation, or |
The segments: a list of vectors of observation indexes, grouped by repetition (accessed in the examples as, e.g., z$rep1$segm1).
######### K-fold
segmkf(n = 10, K = 3)
segmkf(n = 10, K = 3, type = "interleaved")
# Leave-one-out
segmkf(n = 10, K = 10)
# Repeated
segmkf(n = 10, K = 3, nrep = 2)
######### Test-set (repeated)
segmts(n = 10, m = 3, nrep = 5)
######### With blocks
n <- 10
y <- rep(LETTERS[1:5], 2)
y
z <- segmkf(n = n, y = y, K = 3, nrep = 1)
z
y[z$rep1$segm1]
y[z$rep1$segm2]
y[z$rep1$segm3]
z <- segmts(n = n, y = y, m = 3, nrep = 1)
z
y[z$rep1$segm1]