mn.regu.cv: Model-assisted inference for population means based on cross...


View source: R/regu-est-c.r

Description

This function implements model-assisted inference for population means with missing data, using regularized calibrated estimation based on cross validation.

Usage

mn.regu.cv(fold, nrho = NULL, rho.seq = NULL, y, tr, x, ploss = "cal",
  yloss = "gaus", off = 0, ...)

Arguments

fold

A vector of length 2 giving the fold numbers for cross validation in propensity score estimation and outcome regression respectively.

nrho

A vector of length 2 giving the numbers of tuning parameters searched in cross validation.

rho.seq

A list of two vectors giving the tuning parameters in propensity score estimation (first vector) and outcome regression (second vector).

y

An n x 1 vector of outcomes with missing data.

tr

An n x 1 vector of non-missing indicators (=1 if y is observed or 0 if y is missing).

x

An n x p matrix of covariates, used in both propensity score and outcome regression models.

ploss

A loss function used in propensity score estimation (either "ml" or "cal").

yloss

A loss function used in outcome regression (either "gaus" for continuous outcomes or "ml" for binary outcomes).

off

An offset value (e.g., the true value in simulations) used to calculate the z-statistic from augmented IPW estimation.

...

Additional arguments to glm.regu.cv.
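
As a minimal sketch of the shapes described above, the two-element arguments might be built as follows. The grid values are placeholders chosen only for illustration, not recommended settings, and how nrho and rho.seq interact when both are supplied is not stated here, so this sketch sets one or the other.

```r
# Illustrative values only; none of these numbers are package defaults.
fold <- c(5, 5)     # folds for propensity score estimation, then outcome regression
nrho <- c(11, 11)   # grid sizes, paired with rho.seq = NULL as in the Examples call

# Alternatively, explicit grids: a list of two vectors, one per model.
rho.seq <- list(
  exp(seq(log(0.5), log(0.01), length.out = 11)),  # propensity score grid (placeholder)
  exp(seq(log(0.5), log(0.01), length.out = 11))   # outcome regression grid (placeholder)
)

# Basic shape checks matching the argument descriptions.
stopifnot(length(fold) == 2, length(nrho) == 2,
          is.list(rho.seq), length(rho.seq) == 2)
```

In both forms the first element refers to propensity score estimation and the second to outcome regression.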

Details

Two steps are involved in this function: first fitting the propensity score and outcome regression models, and then applying the augmented IPW estimator for a population mean. For ploss="cal", regularized calibrated estimation is performed with cross validation as described in Tan (2020a, 2020b). The method then leads to model-assisted inference, in which confidence intervals are valid with high-dimensional data if the propensity score model is correctly specified but the outcome regression model may be misspecified. With linear outcome models, the inference is also doubly robust. For ploss="ml", regularized maximum likelihood estimation is used (Belloni et al. 2014; Farrell 2015). In this case, standard errors are only shown to be valid if both the propensity score model and the outcome regression model are correctly specified.
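
The second step can be illustrated with a small base-R sketch of the standard augmented IPW estimator of a mean. The helper aipw_mean_sketch is hypothetical and is not the package's mn.aipw. The plug-in variance and the simulated data are assumptions made for illustration, with the true propensity scores and outcome regression standing in for fitted values.

```r
# Hypothetical helper (not part of RCAL): augmented IPW estimate of a population mean.
# y: outcomes (NA allowed where tr == 0); tr: non-missing indicators;
# fp: fitted propensity scores; fo: fitted outcome regression values;
# off: offset subtracted when forming the z-statistic.
aipw_mean_sketch <- function(y, tr, fp, fo, off = 0) {
  n <- length(tr)
  y0 <- ifelse(tr == 1, y, 0)       # missing outcomes get zero weight, so NA never propagates
  phi <- fo + tr * (y0 - fo) / fp   # per-unit augmented IPW terms
  est <- mean(phi)
  v <- mean((phi - est)^2) / n      # plug-in variance of the sample mean of phi
  list(est = est, var = v, ze = (est - off) / sqrt(v))
}

# Simulated data (assumption): true population mean of y is 1.
set.seed(1)
n <- 2000
x <- rnorm(n)
fp <- plogis(0.5 + x)               # true propensity scores used in place of fitted ones
tr <- rbinom(n, 1, fp)
y <- 1 + x + rnorm(n)
y[tr == 0] <- NA                    # outcomes are missing when tr == 0
fo <- 1 + x                         # true outcome regression used in place of fitted values

out <- aipw_mean_sketch(y, tr, fp, fo, off = 1)
ci <- out$est + c(-1, 1) * qnorm(0.975) * sqrt(out$var)  # nominal 95% Wald interval
```

Replacing the missing outcomes with 0 before weighting matters in R, because 0 * NA evaluates to NA and would otherwise make the whole estimate missing.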

Value

ps

A list containing the results from fitting the propensity score model by glm.regu.cv.

fp

The n x 1 vector of fitted propensity scores.

or

A list containing the results from fitting the outcome regression model by glm.regu.cv.

fo

The n x 1 vector of fitted values from outcome regression.

est

A list containing the results from augmented IPW estimation by mn.aipw.

References

Belloni, A., Chernozhukov, V., and Hansen, C. (2014) Inference on treatment effects after selection among high-dimensional controls, Review of Economic Studies, 81, 608-650.

Farrell, M.H. (2015) Robust inference on average treatment effects with possibly more covariates than observations, Journal of Econometrics, 189, 1-23.

Tan, Z. (2020a) Regularized calibrated estimation of propensity scores with model misspecification and high-dimensional data, Biometrika, 107, 137-158.

Tan, Z. (2020b) Model-assisted inference for treatment effects using regularized calibrated estimation with high-dimensional data, Annals of Statistics, 48, 811-837.

Examples

data(simu.data)
n <- dim(simu.data)[1]
p <- dim(simu.data)[2]-2

y <- simu.data[,1]
tr <- simu.data[,2]
x <- simu.data[,2+1:p]
x <- scale(x)

# missing data
y[tr==0] <- NA

mn.cv.rcal <- mn.regu.cv(fold=5*c(1,1), nrho=(1+10)*c(1,1), rho.seq=NULL, y, tr, x, 
                         ploss="cal", yloss="gaus")
unlist(mn.cv.rcal$est)

RCAL documentation built on Nov. 8, 2020, 4:22 p.m.