mn.regu.cv: Model-assisted inference for population means based on cross...
In RCAL: Regularized Calibrated Estimation


This function implements model-assisted inference for population means with missing data, using regularized calibrated estimation based on cross validation.

mn.regu.cv(fold, nrho = NULL, rho.seq = NULL, y, tr, x, ploss = "cal", yloss = "gaus", off = 0, ...)

`fold`	A vector of length 2 giving the fold numbers for cross validation in propensity score estimation and outcome regression respectively.
`nrho`	A vector of length 2 giving the numbers of tuning parameters searched in cross validation.
`rho.seq`	A list of two vectors giving the tuning parameters in propensity score estimation (first vector) and outcome regression (second vector).
`y`	An n x 1 vector of outcomes with missing data.
`tr`	An n x 1 vector of non-missing indicators (=1 if `y` is observed or 0 if `y` is missing).
`x`	An n x p matrix of covariates, used in both propensity score and outcome regression models.
`ploss`	A loss function used in propensity score estimation (either "ml" or "cal").
`yloss`	A loss function used in outcome regression (either "gaus" for continuous outcomes or "ml" for binary outcomes).
`off`	An offset value (e.g., the true value in simulations) used to calculate the z-statistic from augmented IPW estimation.
`...`	Additional arguments to `glm.regu.cv`.
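As a reading aid for the argument shapes described above, the following base R sketch builds values of the documented form without calling any RCAL function. The tuning values placed in `rho.seq` are hypothetical and chosen only to show the structure (a list of two numeric vectors); they are not recommended settings.

```r
# Shapes only: nothing here calls RCAL.
fold <- 5 * c(1, 1)        # 5 folds for propensity score, 5 for outcome regression
nrho <- (1 + 10) * c(1, 1) # 11 tuning parameters searched in each of the two fits

# Hypothetical tuning values, shown only to illustrate the list-of-two-vectors form:
rho.seq <- list(
  c(0.10, 0.05, 0.01),     # first vector: propensity score estimation
  c(0.20, 0.10, 0.05)      # second vector: outcome regression
)

# The idiom k * c(1, 1) is just a compact way to write c(k, k).
same_fold <- identical(fold, c(5, 5))
same_nrho <- identical(nrho, c(11, 11))
```

The first element of each argument refers to propensity score estimation and the second to outcome regression, matching the order given in the descriptions of `fold` and `rho.seq`.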

Two steps are involved in this function: first, fitting the propensity score and outcome regression models; second, applying the augmented IPW estimator for a population mean. For ploss="cal", regularized calibrated estimation is performed with cross validation as described in Tan (2020a, 2020b). The method then leads to model-assisted inference, in which confidence intervals are valid with high-dimensional data if the propensity score model is correctly specified, even when the outcome regression model is misspecified. With linear outcome models, the inference is also doubly robust. For ploss="ml", regularized maximum likelihood estimation is used (Belloni et al. 2014; Farrell 2015). In this case, standard errors are only shown to be valid if both the propensity score model and the outcome regression model are correctly specified.
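The second step can be illustrated with a minimal base R sketch of the textbook augmented IPW estimator of a population mean. This is not the package's `mn.aipw` code: the helper name `aipw_mean_sketch` and its return names are assumptions made for illustration, and the true response probabilities and true regression function of a simulated data set stand in for the fitted values `fp` and `fo`.

```r
# Hypothetical helper (not part of RCAL): textbook augmented IPW for a mean.
# y: outcomes (NA where missing), tr: 1 if observed, fp: fitted propensity
# scores, fo: fitted outcome regression values, off: offset for the z-statistic.
aipw_mean_sketch <- function(y, tr, fp, fo, off = 0) {
  y0 <- ifelse(tr == 1, y, 0)               # missing outcomes get weight zero anyway
  phi <- tr * y0 / fp - (tr / fp - 1) * fo  # per-observation augmented IPW terms
  est <- mean(phi)
  var <- mean((phi - est)^2) / length(phi)  # plug-in variance of the sample mean
  list(est = est, var = var, ze = (est - off) / sqrt(var))
}

# Simulated data in which the population mean of y equals 1.
set.seed(1)
n <- 2000
x <- rnorm(n)
fp <- plogis(0.5 + x)      # true response probabilities, used in place of fitted ones
tr <- rbinom(n, 1, fp)
y <- 1 + 2 * x + rnorm(n)
y[tr == 0] <- NA           # outcomes are missing where tr == 0
fo <- 1 + 2 * x            # true regression function, used in place of fitted values

res <- aipw_mean_sketch(y, tr, fp, fo, off = 1)
unlist(res)
```

Here `off = 1` plays the role described for the `off` argument: it is the true value in the simulation, so `ze` measures how far the estimate falls from the truth in standard-error units.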

`ps`	A list containing the results from fitting the propensity score model by `glm.regu.cv`.
`fp`	The n x 1 vector of fitted propensity scores.
`or`	A list containing the results from fitting the outcome regression model by `glm.regu.cv`.
`fo`	The n x 1 vector of fitted values from outcome regression.
`est`	A list containing the results from augmented IPW estimation by `mn.aipw`.

Belloni, A., Chernozhukov, V., and Hansen, C. (2014) Inference on treatment effects after selection among high-dimensional controls, Review of Economic Studies, 81, 608-650.

Farrell, M.H. (2015) Robust inference on average treatment effects with possibly more covariates than observations, Journal of Econometrics, 189, 1-23.

Tan, Z. (2020a) Regularized calibrated estimation of propensity scores with model misspecification and high-dimensional data, Biometrika, 107, 137-158.

Tan, Z. (2020b) Model-assisted inference for treatment effects using regularized calibrated estimation with high-dimensional data, Annals of Statistics, 48, 811-837.

library(RCAL)  # provides simu.data and mn.regu.cv

data(simu.data)
n <- dim(simu.data)[1]
p <- dim(simu.data)[2]-2

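y <- simu.data[,1]       # column 1: outcome
tr <- simu.data[,2]      # column 2: non-missing indicator
x <- simu.data[,2+1:p]   # columns 3 to p+2: covariates
x <- scale(x)            # center and scale each covariate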

# missing data
y[tr==0] <- NA

mn.cv.rcal <- mn.regu.cv(fold=5*c(1,1), nrho=(1+10)*c(1,1), rho.seq=NULL, y, tr, x, 
                         ploss="cal", yloss="gaus")
unlist(mn.cv.rcal$est)