A flexible interface for computing cross-validation-based measures of maximal association. In an outer layer of V-fold cross-validation, training samples are used to train a prediction algorithm for each outcome. Multiple algorithms may be ensembled using stacking (also known as super learning) based on (V-2)-fold cross-validation. An inner layer of (V-1)-fold cross-validation is used to determine a user-specified combination of outcomes that maximizes a user-specified prediction criterion. The outer-layer validation sample is used to compute a user-specified cross-validated measure of the prediction algorithm's performance in predicting the combined outcome that was computed in the training sample. Several common choices for outcome combinations (a convex combination of outcomes and the single outcome that is most associated) and prediction criteria (nonparametric R^2, negative log-likelihood, and area under the ROC curve) are included; however, users may specify their own criteria as well. The function returns the cross-validated summary measure for the maximally combined outcome and, if desired, the cross-validated summary measure for each outcome.
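The fold counts mentioned above can be illustrated with a small base R sketch. It is not the package's internal code: the variable names (fold_id, outer_train, inner_train) are made up for illustration, and it only shows why holding out one outer fold leaves V-1 folds for the inner layer and V-2 folds for the stacking step.

```r
# Illustrative sketch only (not cvma internals): fold bookkeeping for V = 5.
set.seed(1)
n <- 100
V <- 5

# Assign each observation to one of V outer folds (balanced, random order).
fold_id <- sample(rep(seq_len(V), length.out = n))

# Outer layer: hold out fold 1 for validation, train on the remaining V - 1.
outer_valid <- 1
outer_train <- setdiff(seq_len(V), outer_valid)

# Inner layer: among the V - 1 training folds, hold out one more,
# leaving V - 2 folds for the stacking (super learner) step.
inner_valid <- outer_train[1]
inner_train <- setdiff(outer_train, inner_valid)

n_outer_train <- length(outer_train)  # V - 1
n_inner_train <- length(inner_train)  # V - 2
```

With V = 4 the innermost step would be left with only two folds, which is consistent with the requirement that V be at least four.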
cvma(Y, X, V = 5, learners,
  sl_control = list(ensemble_fn = "ensemble_linear",
    optim_risk_fn = "optim_risk_sl_se", weight_fn = "weight_sl_convex",
    cv_risk_fn = "cv_risk_sl_r2", family = gaussian(), alpha = 0.05),
  y_weight_control = list(ensemble_fn = "ensemble_linear",
    weight_fn = "weight_y_convex", optim_risk_fn = "optim_risk_y_r2",
    cv_risk_fn = "cv_risk_y_r2", alpha = 0.05),
  return_control = list(outer_weight = TRUE, outer_sl = TRUE,
    inner_sl = FALSE, all_y = TRUE, all_learner_assoc = TRUE,
    all_learner_fits = FALSE),
  scale = FALSE)
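A minimal call might look like the sketch below. The simulated data are invented for illustration, and the learner names "SL.glm" and "SL.mean" are wrappers from the SuperLearner package that are assumed here to be acceptable values for learners. The call is guarded so the snippet still runs when cvma or SuperLearner is not installed.

```r
# Simulated data (illustrative only): two predictors, two noisy outcomes.
set.seed(1234)
n <- 100
X <- data.frame(x1 = runif(n, 0, 5), x2 = runif(n, 0, 5))
Y <- data.frame(
  Y1 = rnorm(n, X$x1 + X$x2, 1),
  Y2 = rnorm(n, X$x1 + X$x2, 3)
)

fit <- NULL
if (requireNamespace("cvma", quietly = TRUE) &&
    requireNamespace("SuperLearner", quietly = TRUE)) {
  library(SuperLearner)  # makes the SL.* wrappers visible
  # Assumption: these SuperLearner wrappers are valid entries for `learners`.
  fit <- cvma::cvma(Y = Y, X = X, V = 5,
                    learners = c("SL.glm", "SL.mean"))
  print(fit)
}
```

All control lists are left at the defaults shown in the usage above.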
Y | A matrix or data.frame of outcomes.
X | A matrix or data.frame of predictors.
V | Number of outer folds of cross-validation. Nested cross-validation uses V-1 and V-2 folds, so V must be at least four.
learners | Super learner wrappers. See
sl_control | A list with named entries ensemble_fn, optim_risk_fn, weight_fn, cv_risk_fn, family. Available functions can be viewed with
y_weight_control | A list with named entries ensemble_fn, optim_risk_fn, weight_fn, cv_risk_fn. Available functions can be viewed with
return_control | A list with named entries
scale | If TRUE, standardize each outcome to have mean zero and standard deviation 1.
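What the scale argument describes can be reproduced in base R, as sketched below. Whether cvma performs the standardization with base::scale internally is an assumption; the snippet only shows the transformation itself on made-up data.

```r
# Illustrative data: two outcomes on very different scales.
set.seed(42)
Y <- data.frame(
  Y1 = rnorm(50, mean = 10, sd = 4),
  Y2 = rnorm(50, mean = -3, sd = 0.5)
)

# Center each column to mean zero and rescale to standard deviation 1.
Y_std <- as.data.frame(scale(Y))

col_means <- colMeans(Y_std)      # numerically zero
col_sds   <- apply(Y_std, 2, sd)  # numerically one
```

Standardizing may matter when outcomes are combined with weights, since otherwise an outcome measured on a larger scale could dominate the combination.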
TO DO: Figure out how future works (e.g., can plan() be specified internally or externally?)
cv_assoc returns the risk for the entire procedure.
cv_assoc_all_y returns the cross-validated performance metric for every outcome, including the confidence interval, p-value, and influence curve.
all_learner_assoc returns, for each outcome and learner, the cross-validated metric, confidence interval, associated p-value, and influence curve.
sl_fit returns the Super Learner fit for each outcome and the associated learner risks on all the data. In addition, it returns the fit for all learners based on all folds.
outer_weight returns the outcome weights obtained using the outer-most fold of CV.
inner_weight returns the outcome weights obtained using the inner-most fold of CV.
all_learner_fits returns all learner fits.
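Which of these elements are populated appears to be governed by return_control. The sketch below builds a complete return_control list using the entry names from the usage section, switching all_learner_fits on. How each flag maps to a returned element is not spelled out on this page, so that mapping is an assumption; inspecting names(fit) on a fitted object is the safest check.

```r
# Full return_control list, using the entry names from the usage section.
# Only all_learner_fits differs from the documented defaults.
rc <- list(
  outer_weight = TRUE,
  outer_sl = TRUE,
  inner_sl = FALSE,
  all_y = TRUE,
  all_learner_assoc = TRUE,
  all_learner_fits = TRUE
)

# Names of the components that were requested (flags set to TRUE).
requested <- names(rc)[unlist(rc)]

# Hypothetical usage, assuming Y, X, and learners are already defined:
# fit <- cvma::cvma(Y = Y, X = X, V = 5, learners = learners,
#                   return_control = rc)
# names(fit)
```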
TO DO: Should cvma have $cv_measure, $ci_low, $ci_high, and $p_value returned?
These are seen when the objects are printed so it may be natural for users to think
that those are named in the cvma object.
predict method