est_predictiveness_cv: Estimate a nonparametric predictiveness functional using...

est_predictiveness_cv    R Documentation

Estimate a nonparametric predictiveness functional using cross-fitting

Description

Compute cross-fitted nonparametric estimates of the chosen measure of predictiveness.

Usage

est_predictiveness_cv(
  fitted_values,
  y,
  full_y = NULL,
  folds,
  type = "r_squared",
  C = rep(1, length(y)),
  Z = NULL,
  folds_Z = folds,
  ipc_weights = rep(1, length(C)),
  ipc_fit_type = "external",
  ipc_eif_preds = rep(1, length(C)),
  ipc_est_type = "aipw",
  scale = "identity",
  na.rm = FALSE,
  ...
)
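
As a rough illustration of the input shapes, the sketch below builds folds and a cross-fitted fitted_values list in base R. The simulated data, the use of lm, and every name other than the argument names themselves are assumptions made for illustration; any regression routine could stand in for lm.

```r
set.seed(4747)
n <- 200   # sample size (illustrative)
V <- 5     # number of cross-validation folds (illustrative)

# simulated data: the outcome depends on x1 only
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 + rnorm(n)
y <- dat$y

# one fold label per observation, balanced across the V folds
folds <- sample(rep(seq_len(V), length.out = n))

# for each fold v, fit on the other folds and predict on fold v only
fitted_values <- lapply(seq_len(V), function(v) {
  in_valid <- folds == v
  fit <- lm(y ~ x1 + x2, data = dat[!in_valid, ])
  unname(predict(fit, newdata = dat[in_valid, ]))
})
```

Objects shaped like these match the descriptions of fitted_values, y, and folds given under Arguments, and could then be supplied to est_predictiveness_cv together with type = "r_squared".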

Arguments

fitted_values

fitted values from a regression function using the observed data; either a list of length V (the number of cross-validation folds), where each element is a set of predictions on the validation data for one fold, or a vector of the same length as y.

y

the observed outcome.

full_y

the observed outcome (from the entire dataset, for cross-fitted estimates).

folds

the cross-validation folds for the observed data.

type

the predictiveness measure to estimate (defaults to "r_squared", for R-squared-based variable importance).

C

the indicator of coarsening (1 denotes observed, 0 denotes unobserved).

Z

either NULL (if no coarsening) or a matrix-like object containing the fully observed data.

folds_Z

either the cross-validation folds for the observed data (no coarsening) or a vector of folds for the fully observed data Z.

ipc_weights

weights for inverse probability of coarsening (IPC) weighted estimation (e.g., inverse weights from a two-phase sample). Assumed to be already inverted (i.e., ipc_weights = 1 / [estimated probability weights]).

ipc_fit_type

if "external", then use ipc_eif_preds; if "SL", fit a SuperLearner to determine the correction to the efficient influence function (EIF).

ipc_eif_preds

if ipc_fit_type = "external", the fitted values from a regression of the full-data EIF on the fully observed covariates/outcome; otherwise, not used.

ipc_est_type

IPC correction, either "ipw" (for classical inverse probability weighting) or "aipw" (for augmented inverse probability weighting; the default).

scale

if doing an IPC correction, then the scale that the correction should be computed on (e.g., "identity"; or "logit" to logit-transform, apply the correction, and back-transform).

na.rm

logical; should NAs be removed before computation? (defaults to FALSE)

...

other arguments to SuperLearner, if ipc_fit_type = "SL".
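
The "already inverted" convention for ipc_weights can be sketched as follows. Here the probability of being observed is estimated with a logistic regression on a single fully observed variable and then inverted. The simulated two-phase data, the glm model, and the names z and pi_hat are assumptions made for illustration and are not part of the package.

```r
set.seed(1234)
n <- 500

# z: a variable measured on everyone (it plays the role of Z)
z <- rnorm(n)

# C: coarsening indicator, 1 = fully observed, 0 = not observed
C <- rbinom(n, size = 1, prob = plogis(0.5 + z))

# estimate P(C = 1 | z) with a logistic regression
obs_fit <- glm(C ~ z, family = binomial())
pi_hat <- as.numeric(fitted(obs_fit))

# invert before passing: the function expects weights, not probabilities
ipc_weights <- 1 / pi_hat
```

The resulting vector has one entry per element of C, which matches the length of the default rep(1, length(C)).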

Details

See the paper by Williamson, Gilbert, Simon, and Carone for more details on the mathematics behind this function and the definition of the parameter of interest.

Sample-splitting is recommended, since in this case inferences will be valid even if the variable has zero true importance. If sample-splitting is also requested, then the prediction functions are trained as if 2K-fold cross-validation were run, but each is evaluated on only K of the folds, and the evaluation folds are independent between the full and reduced nuisance regressions.
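
One way to picture this fold structure is sketched below in base R. It follows the description above and is not necessarily how the package builds its folds internally; the names cf_folds, split_id, full_eval_folds, reduced_eval_folds, and train_idx are hypothetical.

```r
set.seed(5678)
n <- 120   # sample size (illustrative)
K <- 3     # number of evaluation folds per nuisance regression

# 2K cross-fitting folds over the n observations
cf_folds <- sample(rep(seq_len(2 * K), length.out = n))

# randomly assign K of the 2K fold labels to each nuisance regression
split_id <- sample(rep(1:2, each = K))
full_eval_folds <- which(split_id == 1)     # full regression is evaluated here
reduced_eval_folds <- which(split_id == 2)  # reduced regression is evaluated here

# training set for evaluation fold v: every observation outside fold v,
# as in ordinary 2K-fold cross-validation
train_idx <- lapply(seq_len(2 * K), function(v) which(cf_folds != v))
```

Because the two sets of evaluation folds do not overlap, the full and reduced regressions are assessed on independent data, while each model is still trained on all observations outside its own evaluation fold.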

Value

The estimated measure of predictiveness.
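
For intuition about what such an estimate looks like when type = "r_squared", the sketch below computes a cross-fitted R-squared by hand: one minus the ratio of validation-fold mean squared error to outcome variance, averaged over folds. This is one plausible plug-in form, assumed here for illustration; the package's estimator, including how full_y and any IPC correction enter, may differ. The simulated data and the use of lm are also assumptions.

```r
set.seed(2024)
n <- 300   # sample size (illustrative)
V <- 5     # number of cross-validation folds (illustrative)

dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 + rnorm(n)
folds <- sample(rep(seq_len(V), length.out = n))

# fold-specific R-squared: train off-fold, evaluate on the held-out fold
fold_r2 <- vapply(seq_len(V), function(v) {
  test <- folds == v
  fit <- lm(y ~ x1 + x2, data = dat[!test, ])
  preds <- predict(fit, newdata = dat[test, ])
  mse <- mean((dat$y[test] - preds)^2)
  # variance measured around the mean of the entire outcome vector
  total_var <- mean((dat$y[test] - mean(dat$y))^2)
  1 - mse / total_var
}, numeric(1))

# cross-fitted estimate: average of the fold-specific estimates
cv_r2 <- mean(fold_r2)
```

In this simulation the outcome has signal variance 4 and noise variance 1, so the population R-squared is 0.8 and cv_r2 should land in that neighborhood.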


bdwilliamson/npvi documentation built on Feb. 1, 2024, 10:46 p.m.