| assign.folds | R Documentation |
Function to assign observations to folds, ensuring a similar distribution across folds (and sites).
assign.folds(y, family = c("binomial", "cox", "gaussian"), site = NULL, nfolds = 10)
y |
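response to be predicted. A binary vector for "binomial", a numeric vector for "gaussian", or a survival response (time and censoring status) for "cox". |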
family |
distribution of y: "binomial", "cox", or "gaussian". |
site |
vector or factor with the sites' names, or NULL for studies conducted in a single site. |
nfolds |
number of folds. |
If family is "binomial", the function randomly assigns the folds separately for each of the two outcomes. If family is "gaussian", the function randomly assigns the folds separately for ranges of the outcome. If family is "cox", the function randomly assigns the folds separately for ranges of time and censoring status. If site is not NULL, the function also randomly assigns the folds separately for each site.
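The stratified assignment described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper name sketch_assign_folds, the use of quartiles as the "ranges" of a numeric outcome, and the way outcome and site are combined into a single stratum are all assumptions made for illustration.

```r
# Hypothetical helper, NOT the package's code: assigns folds at random within
# each stratum so that every fold receives a similar share of each stratum.
sketch_assign_folds <- function(strata, nfolds = 10) {
  strata <- as.factor(strata)
  fold <- integer(length(strata))
  for (idx in split(seq_along(strata), strata)) {
    # Balanced fold labels for this stratum; the starting fold is randomized
    # so that no fold systematically receives the leftover observations
    labels <- rep_len(sample.int(nfolds), length(idx))
    # Shuffle the labels among the observations of the stratum
    fold[idx] <- labels[sample.int(length(idx))]
  }
  fold
}

set.seed(1)

# "binomial"-like case: stratify by outcome and, as an assumption, by site too
y <- rbinom(200, 1, 0.3)
site <- sample(c("A", "B"), 200, replace = TRUE)
fold <- sketch_assign_folds(interaction(y, site, drop = TRUE), nfolds = 4)
table(fold, y)

# "gaussian"-like case: stratify by ranges of the outcome (quartiles assumed)
z <- rnorm(200)
z_range <- cut(z, quantile(z, 0:4 / 4), include.lowest = TRUE)
fold_z <- sketch_assign_folds(z_range, nfolds = 4)
table(fold_z, z_range)
```

In this sketch the fold counts within any one stratum differ by at most one, which is what keeps the distribution of the outcome (and of sites) similar across folds.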
A numeric vector with the fold assigned to each observation.
Joaquim Radua and Aleix Solanes
Solanes, A., Mezquida, G., Janssen, J., Amoretti, S., Lobo, A., Gonzalez-Pinto, A., Arango, C., Vieta, E., Castro-Fornieles, J., Berge, D., Albacete, A., Gine, E., Parellada, M., Bernardo, M.; PEPs group (collaborators); Pomarol-Clotet, E., Radua, J. (2022) Combining MRI and clinical data to detect high relapse risk after the first episode of psychosis. Schizophrenia, 8, 100, doi:10.1038/s41537-022-00309-w.
cv for conducting a cross-validation.
# Create random y (numeric)
y = rnorm(200, sample(c(1, 10), 200, replace = TRUE))
# Assign folds
fold = assign.folds(y, "gaussian", nfolds = 4)
# Check that the distribution of y is similar across folds
oldpar = par(mfrow = c(2, 2))
for (i in 1:4) {
hist(y[which(fold == i)], main = paste("Fold", i), xlab = "y")
}
par(oldpar)