plateau_selector: IPW Estimator Selector Using Lepski's Plateau Method for the...


View source: R/selector_plateau.R

Description

IPW Estimator Selector Using Lepski's Plateau Method for the MSE

Usage

plateau_selector(
  W,
  A,
  Y,
  delta = 0,
  gn_pred_natural,
  gn_pred_shifted,
  gn_fit_haldensify,
  Qn_pred_natural,
  Qn_pred_shifted,
  cv_folds = 10L,
  gcv_mult = 50L,
  bootstrap = FALSE,
  n_boot = 1000L,
  ...
)

Arguments

W

A matrix, data.frame, or similar containing a set of baseline covariates.

A

A numeric vector corresponding to an exposure variable. The parameter of interest is defined as a location shift of this quantity.

Y

A numeric vector of the observed outcomes.

delta

A numeric value indicating the shift in the exposure to be used in defining the target parameter. This is defined with respect to the scale of the exposure (A).
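As a minimal illustration of what a location shift on the exposure scale means, the base R sketch below adds delta to each observed exposure value. The data are made up for illustration, and the name A_shifted is hypothetical rather than part of the package.

```r
# Toy exposure values (hypothetical data, for illustration only)
A <- c(0.5, 1.2, 2.0, 3.1)

# Shift magnitude, expressed in the same units as A
delta <- 0.5

# Location shift: every observed exposure is moved by the same amount
A_shifted <- A + delta
print(A_shifted)
```

With the default delta = 0 the shifted exposure coincides with the observed exposure.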

gn_pred_natural

A matrix of conditional density estimates of the exposure mechanism g(A|W) along a grid of the regularization parameter, at the natural (observed, actual) values of the exposure.

gn_pred_shifted

A matrix of conditional density estimates of the exposure mechanism g(A+delta|W) along a grid of the regularization parameter, at the shifted (counterfactual) values of the exposure.
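The sketch below illustrates one plausible layout for the two density matrices: one row per observation and one column per value on the grid. That layout is an assumption, not something stated above. A Gaussian density with a grid of standard deviations stands in for the HAL-based conditional density estimator, and dens_on_grid, sd_grid, gn_natural_toy, and gn_shifted_toy are hypothetical names.

```r
set.seed(1)
n <- 5
W <- rnorm(n)            # a single toy baseline covariate
A <- W + rnorm(n)        # toy exposure that depends on W
delta <- 0.5

# Stand-in for the grid of the regularization parameter (assumption)
sd_grid <- c(0.5, 1, 2)

# Evaluate a toy conditional density at exposure values `a`, once per grid
# value; the result has length(a) rows and length(grid) columns
dens_on_grid <- function(a, w, grid) {
  out <- vapply(
    grid,
    function(s) dnorm(a, mean = w, sd = s),
    numeric(length(a))
  )
  colnames(out) <- paste0("grid_", seq_along(grid))
  out
}

gn_natural_toy <- dens_on_grid(A, W, sd_grid)          # at observed A
gn_shifted_toy <- dens_on_grid(A + delta, W, sd_grid)  # at A + delta
dim(gn_natural_toy)
```

Both matrices share the same dimensions, so each column of one can be compared with the matching column of the other.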

gn_fit_haldensify

An object of class haldensify of the fitted conditional density model for the natural exposure mechanism. This should be the fit object returned by haldensify as part of a call to ipw_shift.

Qn_pred_natural

A numeric vector of the outcome mechanism estimate at the natural (i.e., observed) values of the exposure. HAL regression is used for the estimate, with the regularization term chosen by cross-validation.

Qn_pred_shifted

A numeric vector of the outcome mechanism estimate at the shifted (i.e., counterfactual) values of the exposure. HAL regression is used for the estimate, with the regularization term chosen by cross-validation.

cv_folds

A numeric giving the number of folds to be used for cross-validation. Note that this form of sample splitting is used for the selection of tuning parameters by empirical risk minimization, not for the estimation of nuisance parameters (i.e., to relax regularity conditions).
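To make the notion of folds concrete, the base R sketch below splits n observations into cv_folds validation folds of nearly equal size. It is a generic illustration only. It does not reproduce how the package builds its folds internally, and fold_id is a hypothetical name.

```r
set.seed(42)
n <- 25          # toy sample size
cv_folds <- 10L  # number of folds, matching the documented default

# Recycle the fold labels 1..cv_folds over n observations, then shuffle so
# that fold membership is random
fold_id <- sample(rep(seq_len(cv_folds), length.out = n))

# Fold sizes differ by at most one observation
table(fold_id)
```

Each observation is held out exactly once, in the fold given by fold_id, and used for fitting in the remaining folds.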

gcv_mult

TODO

bootstrap

A logical indicating whether the estimator variance should be approximated using the nonparametric bootstrap. The default is FALSE, in which case the empirical variances of the IPW estimating function and the EIF are used for estimator selection and for variance estimation, respectively. When set to TRUE, the bootstrap variance is used for both of these purposes instead. Note that the bootstrap is very computationally intensive and scales relatively poorly.

n_boot

A numeric giving the number of bootstrap re-samples to be used in computing the plateau estimator selection criterion. The default uses 1000 bootstrap samples, though it may be appropriate to use fewer such samples for experimentation purposes. This is ignored when bootstrap is set to FALSE (its default).
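The difference between the two variance approximations can be sketched in base R with a toy weighted-mean estimator. This is not the package's implementation: the weights are simulated rather than derived from a fitted density, and ipw_est, wts, var_boot, and var_emp are hypothetical names.

```r
set.seed(123)
n <- 200
Y <- rnorm(n, mean = 2)           # toy outcomes
wts <- runif(n, 0.5, 1.5)         # simulated stand-in for IPW weights

# Toy IPW-style estimator: the sample mean of weight times outcome
ipw_est <- function(y, w) mean(w * y)

# Nonparametric bootstrap: resample observations with replacement and
# recompute the estimator on each resample
n_boot <- 500L
boot_est <- replicate(n_boot, {
  idx <- sample.int(n, n, replace = TRUE)
  ipw_est(Y[idx], wts[idx])
})
var_boot <- var(boot_est)

# Closed-form alternative: empirical variance of the estimating function,
# scaled by the sample size
var_emp <- var(wts * Y) / n

c(bootstrap = var_boot, empirical = var_emp)
```

In this toy setting the two values are close, while the bootstrap costs n_boot re-evaluations of the estimator. When the estimator involves refitting a density model on every resample, that cost grows quickly, which is why fewer re-samples may be reasonable for experimentation.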

...

Additional arguments for model fitting to be passed directly to haldensify.


haldensify documentation built on Feb. 10, 2022, 1:07 a.m.