| cv | R Documentation |
Function to easily cross-validate a prediction model (including fold assignment, merging of fold outputs, etc.).
cv(x, y, family = c("binomial", "cox", "gaussian"), fit_fun, predict_fun, site = NULL,
covar = NULL, nfolds = 10, pred.format = NA, verbose = TRUE, ...)
x |
predictors. A matrix or data.frame (rows are observations and columns are variables), or a vector or factor (if there is only one predictor). |
y |
response to be predicted. A binary vector for "binomial", a "Surv" object for "cox", or a numeric vector for "gaussian". |
family |
distribution of y: "binomial", "cox", or "gaussian". |
fit_fun |
function to create the prediction model using the training subsets. It can have between two and four arguments (the first two are compulsory):
predict_fun |
function to apply the prediction model to the test sets. It can have between two and four arguments (the first two are compulsory): |
site |
vector or factor with the sites' names, or NULL for studies conducted in a single site. |
covar |
other covariates that can be passed to fit_fun and predict_fun. A matrix or data.frame (rows are observations and columns are variables), or a vector or factor (if there is only one covariate). |
... |
other arguments that can be passed to fit_fun and predict_fun. |
nfolds |
number of folds, only used if |
pred.format |
format of the predictions returned by each fold. E.g., if the prediction is an array, use NA. |
verbose |
(optional) logical, whether to print some messages during execution. |
This function iteratively divides the dataset into a training dataset, with which it fits the model using the function fit_fun, and a test dataset, to which it applies the model using the function predict_fun. It saves the models fitted with the training datasets and the predictions obtained in the test datasets. The folds are assigned automatically using assign.folds, accounting for the site if this is not NULL.
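The loop described above can be illustrated with a minimal base-R sketch. It is not the package's implementation: the within-site fold assignment is only an assumption about what "accounting for the site" might look like (the real assign.folds may behave differently), glm stands in for the model, and the names folds, models and predictions are illustrative.

```r
# Minimal base-R sketch of a k-fold loop; NOT the package's actual code.
set.seed(1)
n <- 200
x <- matrix(rnorm(n * 3), ncol = 3)
y <- rbinom(n, 1, plogis(x[, 1] - x[, 2]))
site <- rep(c("A", "B"), each = n / 2)
nfolds <- 4

# Assumed fold assignment: shuffle fold labels within each site so that
# every fold receives a similar share of each site.
folds <- integer(n)
for (s in unique(site)) {
  idx <- which(site == s)
  folds[idx] <- sample(rep_len(seq_len(nfolds), length(idx)))
}

# Two-argument fit and predict functions, as in the compulsory signature.
fit_fun <- function(x_training, y_training) {
  glm(y ~ ., data = data.frame(y = y_training, x_training), family = binomial())
}
predict_fun <- function(m, x_test) {
  predict(m, newdata = data.frame(x_test), type = "response")
}

# Fit on the training part, predict on the held-out part, keep both.
models <- vector("list", nfolds)
predictions <- data.frame(fold = folds, y = y, y.pred = NA_real_)
for (k in seq_len(nfolds)) {
  test <- folds == k
  models[[k]] <- fit_fun(x[!test, , drop = FALSE], y[!test])
  predictions$y.pred[test] <- predict_fun(models[[k]], x[test, , drop = FALSE])
}
res <- list(predictions = predictions, models = models)
```

Each observation is predicted exactly once, by a model that did not see it during fitting, which is why the merged predictions can be compared directly with y.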
A list with the predictions and the models used.
Joaquim Radua
glmnet_predict for obtaining predictions.
# Create random x (predictors) and y (binary)
x = matrix(rnorm(25000), ncol = 50)
y = 1 * (plogis(apply(x[,1:5], 1, sum) + rnorm(500, 0, 0.1)) > 0.5)
# Predict y via cross-validation
fit_fun = function (x_training, y_training) {
list(
lasso = glmnet_fit(x_training, y_training, family = "binomial")
)
}
predict_fun = function (m, x_test) {
glmnet_predict(m$lasso, x_test)
}
# Only 2 folds to ensure the example runs quickly
res = cv(x, y, family = "binomial", fit_fun = fit_fun, predict_fun = predict_fun, nfolds = 2)
# Show accuracy
se = mean(res$predictions$y.pred[res$predictions$y == 1] > 0.5)
sp = mean(res$predictions$y.pred[res$predictions$y == 0] < 0.5)
bac = (se + sp) / 2
cat("Sensitivity:", round(se, 2), "\n")
cat("Specificity:", round(sp, 2), "\n")
cat("Balanced accuracy:", round(bac, 2), "\n")