# est_predictiveness_cv: Estimate a nonparametric predictiveness functional using... In bdwilliamson/nova: Perform Inference on Algorithm-Agnostic Variable Importance

 est_predictiveness_cv R Documentation

## Estimate a nonparametric predictiveness functional using cross-fitting

### Description

Compute nonparametric estimates of the chosen measure of predictiveness.

### Usage

```
est_predictiveness_cv(
  fitted_values,
  y,
  full_y = NULL,
  folds,
  type = "r_squared",
  C = rep(1, length(y)),
  Z = NULL,
  folds_Z = folds,
  ipc_weights = rep(1, length(C)),
  ipc_fit_type = "external",
  ipc_eif_preds = rep(1, length(C)),
  ipc_est_type = "aipw",
  scale = "identity",
  na.rm = FALSE,
  ...
)
```

### Arguments

| Argument | Description |
|---|---|
| `fitted_values` | fitted values from a regression function using the observed data; a list of length V, where each object is a set of predictions on the validation data, or a vector of the same length as `y`. |
| `y` | the observed outcome. |
| `full_y` | the observed outcome (from the entire dataset, for cross-fitted estimates). |
| `folds` | the cross-validation folds for the observed data. |
| `type` | which parameter are you estimating (defaults to `r_squared`, for R-squared-based variable importance)? |
| `C` | the indicator of coarsening (1 denotes observed, 0 denotes unobserved). |
| `Z` | either `NULL` (if no coarsening) or a matrix-like object containing the fully observed data. |
| `folds_Z` | either the cross-validation folds for the observed data (no coarsening) or a vector of folds for the fully observed data `Z`. |
| `ipc_weights` | weights for inverse probability of coarsening (e.g., inverse weights from a two-phase sample) weighted estimation. Assumed to be already inverted (i.e., `ipc_weights = 1 / [estimated probability weights]`). |
| `ipc_fit_type` | if `"external"`, then use `ipc_eif_preds`; if `"SL"`, fit a SuperLearner to determine the correction to the efficient influence function. |
| `ipc_eif_preds` | if `ipc_fit_type = "external"`, the fitted values from a regression of the full-data EIF on the fully observed covariates/outcome; otherwise, not used. |
| `ipc_est_type` | IPC correction, either `"ipw"` (for classical inverse probability weighting) or `"aipw"` (for augmented inverse probability weighting; the default). |
| `scale` | if doing an IPC correction, then the scale that the correction should be computed on (e.g., `"identity"`; or `"logit"` to logit-transform, apply the correction, and back-transform). |
| `na.rm` | logical; should `NA`s be removed in computation? (defaults to `FALSE`) |
| `...` | other arguments to SuperLearner, if `ipc_fit_type = "SL"`. |
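The convention that `ipc_weights` is "already inverted" can be illustrated with a small base-R sketch. Everything here is an assumption made for illustration: the simulated data, the sampling probabilities `pi_hat`, and the variable `w_full` are hypothetical, and the weighted mean at the end is a generic inverse-probability-weighted estimator, not the package's internal computation.

```r
set.seed(101)
n <- 200

# Hypothetical variable that would only be measured in phase two.
# It is simulated for everyone here so the arithmetic below is well defined.
w_full <- rnorm(n, mean = 1)

# Hypothetical phase-two sampling probabilities (known by design here;
# in practice they might be estimated)
pi_hat <- rep(c(0.5, 0.8), each = n / 2)

# Coarsening indicator: 1 = observed, 0 = unobserved
C <- rbinom(n, size = 1, prob = pi_hat)

# "Already inverted": pass 1 / probability, not the probability itself
ipc_weights <- 1 / pi_hat

# Generic IPW estimate of the mean of w_full, using only observed units
ipw_mean <- mean(C * ipc_weights * w_full)
```

Units with a lower chance of being observed receive a larger weight, so the observed subsample stands in for the full sample.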

### Details

See the paper by Williamson, Gilbert, Simon, and Carone for the mathematics behind this function and the definition of the parameter of interest.

If sample-splitting is also requested (recommended, since inferences are then valid even if the variable has zero true importance), the prediction functions are trained as if 2K-fold cross-validation were run, but each is evaluated on only K of the folds; the K evaluation folds for the full nuisance regression are independent of those for the reduced nuisance regression.
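The 2K-fold layout described above can be sketched in base R as follows. This is only one way to build such a layout, under stated assumptions: the names `full_eval_folds` and `reduced_eval_folds` are hypothetical, and assigning odd-numbered folds to the full regression and even-numbered folds to the reduced regression is an illustrative choice, not necessarily how the package forms the split.

```r
set.seed(20230213)
n <- 120
K <- 3

# Assign each observation to one of 2K folds, as evenly as possible
folds <- sample(rep(seq_len(2 * K), length.out = n))

# Hypothetical split of the 2K folds into two groups of K:
# odd-numbered folds evaluate the full regression,
# even-numbered folds evaluate the reduced regression
full_eval_folds <- seq(1, 2 * K, by = 2)
reduced_eval_folds <- seq(2, 2 * K, by = 2)

# Observations used to evaluate each regression; the two sets are disjoint
full_idx <- which(folds %in% full_eval_folds)
reduced_idx <- which(folds %in% reduced_eval_folds)
```

Because no observation is used to evaluate both regressions, the two predictiveness estimates are based on separate data.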

### Value

The estimated measure of predictiveness.
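To make the inputs concrete, the following base-R sketch builds `fitted_values` as a list of length V and computes a cross-fitted R-squared by averaging fold-specific estimates. It is a minimal illustration under assumptions, not the package's implementation: the simulated data, the use of `lm` as the learner, and the choice to scale each fold's mean squared error by the variance of the full outcome vector are all assumptions made here.

```r
set.seed(4747)
n <- 200
V <- 5

# Simulated data (hypothetical): one covariate, linear signal plus noise
x <- rnorm(n)
y <- 2 * x + rnorm(n)
dat <- data.frame(x = x, y = y)

# Cross-validation fold labels, one per observation
folds <- sample(rep(seq_len(V), length.out = n))

# fitted_values: list of length V; element v holds predictions on fold v
# from a model trained on the remaining folds
fitted_values <- lapply(seq_len(V), function(v) {
  fit <- lm(y ~ x, data = dat[folds != v, ])
  unname(predict(fit, newdata = dat[folds == v, ]))
})

# Fold-specific R-squared: 1 - validation MSE / variance of the full outcome
fold_r2 <- vapply(seq_len(V), function(v) {
  mse <- mean((y[folds == v] - fitted_values[[v]])^2)
  1 - mse / mean((y - mean(y))^2)
}, numeric(1))

# Cross-fitted estimate: average over the V validation folds
cv_r2 <- mean(fold_r2)
```

The objects `fitted_values`, `y`, and `folds` built here have the shapes described under Arguments, with `y` also playing the role of `full_y`.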

bdwilliamson/nova documentation built on Feb. 13, 2023, 9:52 a.m.