CVrandomByID: Function to make a CV scheme based on random sampling of...

Description

Creates a cross-validation (CV) scheme by randomly sampling observation IDs.

Usage

CVrandomByID(ID, seed, k, exclusive)

Arguments

ID

character vector of the observation IDs used in the randomization. The names are a combination of the entry name and the location.

seed

numeric seed passed to the set.seed function so that the randomization can be reproduced by the user. Default is NULL, in which case 123 is used as the seed.

k

integer value for the number of folds used in the k-fold cross-validation.

exclusive

logical indicating whether sampling should be done without replacement. The argument is passed, negated, to the replace argument of the sample.int function: exclusive = TRUE means replace = FALSE, such that the probability of choosing the next item is proportional to the weights amongst the remaining items.
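
The effect of the replace argument can be seen with base R's sample.int directly. The snippet below is only an illustration of that argument; the variable names are made up for the example and it does not call CVrandomByID.

```r
# Illustration of the replace argument of base R's sample.int.
# Variable names are hypothetical; this is not CVrandomByID itself.
set.seed(123)
n <- 10

# replace = FALSE (what exclusive = TRUE maps to): a permutation, no repeats
without_repl <- sample.int(n, size = n, replace = FALSE)

# replace = TRUE (what exclusive = FALSE maps to): repeats are possible
with_repl <- sample.int(n, size = n, replace = TRUE)

# 0 means that no index was drawn more than once
anyDuplicated(without_repl)
```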

Details

The randomization uses the sample function.

Value

named numeric vector giving, for each observation, the set (fold) to which it is assigned in the k-fold cross-validation.
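
To make the shape of such a return value concrete, here is a minimal sketch of one way to assign IDs to k folds at random. It is an assumption for illustration only: the helper assign_folds_sketch is hypothetical and is not the implementation of CVrandomByID, which may differ in how it shuffles and balances the folds.

```r
# Hypothetical sketch; NOT the actual CVrandomByID implementation.
assign_folds_sketch <- function(ID, seed = 123, k = 5) {
  set.seed(seed)
  ids <- unique(ID)
  # Shuffle the unique IDs, then deal them out to folds 1..k in turn
  shuffled <- sample(ids)
  fold_of_id <- setNames(rep_len(seq_len(k), length(shuffled)), shuffled)
  # Map every observation back to the fold of its ID
  folds <- fold_of_id[ID]
  names(folds) <- ID
  folds
}

# Made-up IDs combining an entry name and a location
ids <- paste(c("A", "B", "C", "D", "E", "F"), "loc1", sep = "_")
folds <- assign_folds_sketch(ids, seed = 123, k = 3)
table(folds)
```

The result is a vector named by observation ID whose values are fold numbers, so table() gives the fold sizes.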

Author(s)

Ruud Derijcker

References

Based on synbreed's crossVal function

Examples

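data(exampleCV)
# Select the columns by name so their order matches the new column names
y <- exampleCV[, c("GERMPLASM", "LOCATION")]
colnames(y) <- c("IDUnique", "FACTOR")
# Build the observation ID from the entry name and the location
y$ID <- paste(y$IDUnique, y$FACTOR, sep = "_")
# Drop rows with a missing entry name or location
y <- na.omit(y)
n <- length(y$ID)
output <- CVrandomByID(y$ID, seed = 123, k = 5, exclusive = TRUE)
# Number of observations per fold, then the first few assignments
table(output)
head(output)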

digiYozhik/msc_thesis documentation built on May 14, 2019, 5:16 p.m.