CVrandomByID: Function to make a CV scheme based on random sampling of...
In digiYozhik/msc_thesis: Functions to support master thesis


Function to make a CV scheme based on random sampling of observation IDs

CVrandomByID(ID, seed, k, exclusive)

`ID`	character vector of the observation IDs used in the randomization. The names are a combination of the entry name and the location.
`seed`	numeric value for the seed passed to the set.seed function, so that the randomization can be reproduced by the user. Default is NULL, in which case 123 is used as the seed.
`k`	integer value for the number of folds used in the k-fold cross-validation.
`exclusive`	logical indicating whether sampling should be done without replacement. The argument is passed, negated, to the replace argument of the sample.int function, i.e. exclusive=TRUE means replace=FALSE, such that the probability of choosing the next item is proportional to the weights amongst the remaining items.

The sample function is used for the randomization.
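The randomization described above can be illustrated with a minimal sketch. This is not the package's source code: the function name cv_random_sketch, the default k = 5, and the balanced pool of fold labels are assumptions made for illustration. Only the seed fallback to 123 and the mapping of exclusive onto replace follow the argument descriptions.

```r
# Hypothetical sketch of a seeded random fold assignment (not the package's code).
cv_random_sketch <- function(ID, seed = NULL, k = 5, exclusive = TRUE) {
  # Fall back to 123 when no seed is supplied, as the seed argument describes.
  if (is.null(seed)) seed <- 123
  set.seed(seed)
  n <- length(ID)
  # Assumption: fold labels 1..k are recycled to length n to form the pool.
  pool <- rep(seq_len(k), length.out = n)
  # exclusive = TRUE maps to replace = FALSE, i.e. a permutation of the pool.
  folds <- sample(pool, size = n, replace = !exclusive)
  names(folds) <- ID
  folds
}

# Made-up IDs combining an entry name and a location.
ids <- paste0("entry", 1:10, "_loc", rep(1:2, each = 5))
folds <- cv_random_sketch(ids, seed = 123, k = 5)
table(folds)
```

With exclusive = TRUE the pool is only permuted, so the folds stay balanced; with exclusive = FALSE the labels are drawn with replacement and fold sizes can differ.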

Named numeric vector giving, for each observation, the set (fold) to which it is assigned in the k-fold cross-validation.

Ruud Derijcker

Based on synbreed's crossVal function

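# Load the example data set
data(exampleCV)
# Keep the germplasm and location columns
y <- exampleCV[,which(colnames(exampleCV) %in% c("GERMPLASM", "LOCATION"))]
colnames(y) <- c("IDUnique","FACTOR")
# Build the observation ID from the entry name and the location
y$ID <- paste(y$IDUnique, y$FACTOR, sep="_")
y <- na.omit(y)
n <- length(y$ID)
# Assign every observation to one of 5 folds
output <- CVrandomByID(y$ID, seed=123, k=5, exclusive=TRUE)
# Fold sizes and the first few assignments
table(output)
head(output)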
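As a possible follow-up, the named assignment vector can be turned into training and test sets, one pair per fold. The sketch below uses a small made-up vector in place of real CVrandomByID output, and the names toy_output and splits are hypothetical.

```r
# Made-up fold assignment standing in for the output of CVrandomByID.
toy_output <- c(A_loc1 = 1, B_loc1 = 2, C_loc1 = 3,
                A_loc2 = 2, B_loc2 = 3, C_loc2 = 1)

# Assumption: folds are labelled 1..k, so the largest label gives k.
k <- max(toy_output)

# For each fold, the observations in that fold form the test set
# and all remaining observations form the training set.
splits <- lapply(seq_len(k), function(fold) {
  list(test  = names(toy_output)[toy_output == fold],
       train = names(toy_output)[toy_output != fold])
})

splits[[1]]$test
```

Each element of splits then holds the observation IDs to use for fitting and for validation in the corresponding fold.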