cv_auc: Cross-validated area under the receiver operating...
In benkeser/predtmle: Small sample estimators of cross-validated prediction metrics


This function computes K-fold cross-validated estimates of the area under the receiver operating characteristic (ROC) curve (hereafter, AUC). This quantity can be interpreted as the probability that a randomly selected case will have a higher predicted risk than a randomly selected control.
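The pairwise interpretation above can be illustrated with a minimal sketch in base R. The helper `empirical_auc` is a hypothetical name introduced here for illustration and is not part of the package; it assumes `y` is coded `0`/`1` and counts tied predictions as one half.

```r
# Hypothetical helper (not part of predtmle): empirical AUC as the proportion
# of case-control pairs in which the case has the higher predicted risk.
empirical_auc <- function(y, pred) {
  case_pred <- pred[y == 1]
  control_pred <- pred[y == 0]
  # Difference in predicted risk for every (case, control) pair
  diffs <- outer(case_pred, control_pred, "-")
  # Correctly ordered pairs count 1, ties count 1/2
  mean((diffs > 0) + 0.5 * (diffs == 0))
}

y <- c(0, 0, 1, 1)
pred <- c(0.1, 0.4, 0.35, 0.8)
empirical_auc(y, pred)  # 3 of the 4 case-control pairs are correctly ordered: 0.75
```

This only illustrates the quantity being estimated; it evaluates predictions on whatever data are supplied and does no cross-validation.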


cv_auc(Y, X, K = 10, learner = "glm_wrapper", nested_cv = TRUE,
  nested_K = K - 1, parallel = FALSE, max_cvtmle_iter = 10,
  cvtmle_ictol = 1/length(Y), prediction_list = NULL, ...)

`Y`	A numeric vector of outcomes, assumed to equal `0` or `1`.
`X`	A `data.frame` or `matrix` of variables for prediction.
`K`	The number of cross-validation folds (default is `10`).
`learner`	A wrapper that implements the desired method for building a prediction algorithm. See TODO: ADD DOCUMENTATION FOR WRITING
`nested_cv`	A boolean indicating whether nested cross-validation should be used to estimate the distribution of the prediction function. The default (`TRUE`) is the best choice for aggressive `learner`s, while `FALSE` is reasonable for smooth `learner`s (e.g., logistic regression).
`nested_K`	If nested cross validation is used, how many inner folds should there be? Default (`K-1`) affords quicker computation by reusing training fold learner fits.
`parallel`	A boolean indicating whether prediction algorithms should be trained in parallel. Defaults to `FALSE`.
`max_cvtmle_iter`	Maximum number of iterations for the bias correction step of the CV-TMLE estimator (default `10`).
`cvtmle_ictol`	The CV-TMLE will iterate until `max_cvtmle_iter` is reached or the mean of the cross-validated efficient influence function is less than `cvtmle_ictol`.
`prediction_list`	For power users: a list of predictions made by `learner` that has a format compatible with `cvauc`.
`...`	Other arguments, not currently used.

To estimate the AUC of a particular prediction algorithm, K-fold cross-validation is commonly used. The data are partitioned into K distinct groups (folds). The prediction algorithm is developed using K-1 of these groups. In standard K-fold cross-validation, the AUC of this prediction algorithm is estimated using the remaining group.
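The standard procedure described above can be sketched in base R as follows. This is an illustration only and not the estimator that `cv_auc` implements: it assumes plain logistic regression via `glm` as the prediction algorithm, uses a hypothetical helper `pairwise_auc`, and averages the fold-specific AUCs, which is one common convention for combining folds.

```r
set.seed(1)
n <- 200
p <- 10
X <- data.frame(matrix(rnorm(n * p), nrow = n, ncol = p))
Y <- rbinom(n, 1, plogis(X[, 1] + X[, 10]))

K <- 5
# Randomly assign each observation to one of K groups of equal size
fold <- sample(rep(seq_len(K), length.out = n))

# Hypothetical helper: proportion of correctly ordered case-control pairs
# (ties count as 1/2)
pairwise_auc <- function(y, pred) {
  diffs <- outer(pred[y == 1], pred[y == 0], "-")
  mean((diffs > 0) + 0.5 * (diffs == 0))
}

fold_auc <- sapply(seq_len(K), function(k) {
  train <- fold != k
  # Develop the prediction algorithm on the K-1 training groups
  train_data <- data.frame(Y = Y[train], X[train, , drop = FALSE])
  fit <- glm(Y ~ ., data = train_data, family = binomial())
  # Estimate its AUC using the held-out group
  pred <- predict(fit, newdata = X[!train, , drop = FALSE], type = "response")
  pairwise_auc(Y[!train], pred)
})

# Assumed convention: average the K fold-specific AUC estimates
mean(fold_auc)
```

With only 40 observations per held-out group in this setup, each fold-specific AUC rests on relatively few case-control pairs.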

A list TO DO: More documentation here

n <- 200
p <- 10
X <- data.frame(matrix(rnorm(n*p), nrow = n, ncol = p))
Y <- rbinom(n, 1, plogis(X[,1] + X[,10]))
fit <- cv_auc(Y = Y, X = X, K = 5, learner = "glm_wrapper")