nawt estimates a prespecified parameter of interest (e.g., the
average treatment effects (ATE) or the average treatment effects on the
treated (ATT)) with the inverse probability weighting where propensity scores
are estimated using estimating equations suitable for the parameter of
interest. It includes the covariate balancing propensity score proposed by
Imai and Ratkovic (2014), which uses covariate balancing conditions in
propensity score estimation. nawt can also be used to estimate average
outcomes in missing outcome cases.
formula 
an object of class 
outcome 
a character string specifying the name of outcome values
in 
estimand 
a character string specifying a parameter of interest. Choose "ATT" for the average treatment effects on the treated estimation, "ATE" for the average treatment effects estimation, "ATC" for the average outcomes estimation in missing outcome cases. You can choose "ATEcombined" for the combined estimation for the average treatment effects estimation. 
method 
a character string specifying a type of weighting functions in propensity score estimation (ω(π)). Choose "score" for a power function of propensity scores (need to specify the value for alpha), "cb" for a covariate balancing weighting function, or "both" to use both the above weighting functions (need to specify the value for alpha). 
data 
a data frame (or one that can be coerced to that class) containing the outcomes and the variables in the model. 
weights 
an optional vector of ‘prior weights’ (e.g. sampling weights) to be used in the fitting process. Should be NULL or a numeric vector. 
alpha 
a positive value for an exponent in a power weighting function
(ω(π) = π^α, in the ATT estimation, for example).
Default is 2. Set to 0 to use the standard logistic regression for
propensity score estimation. Note that 
twostep 
a logical value indicating whether to use a two-step estimator
when 
boot 
a logical value indicating whether to use a nonparametric
bootstrapping method to estimate the variance-covariance matrix and
confidence intervals for parameters. Default is 
B 
the number of bootstrap replicates. Default is 2,000. 
clevel 
confidence level. Default is 0.95. 
message 
a logical value indicating whether messages are shown or not. 
The treatment variable (or, missingness variable in missing outcome cases) must be binary and coded as 0 (for controlled or nonmissing observations) or 1 (for treated or missing observations).
When the data frame has incomplete cases, which have NAs for any of
the treatment variable, explanatory variables for propensity score
estimation, or the outcome variable, nawt conducts listwise deletion.
Returned values (e.g., weights, ps, data) do not contain
values for these deleted cases.
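The listwise deletion described above can be illustrated in base R. This is only a sketch of the general idea, not the package's internal code; the toy data frame and its column names are hypothetical.

```r
# Hypothetical toy data; the column names are illustrative only.
toy <- data.frame(
  treat = c(1, 0, NA, 1, 0),
  x1    = c(0.2, NA, 1.1, -0.4, 0.7),
  y     = c(3.1, 2.0, 4.2, NA, 1.5)
)

# Keep only rows with no NA in the treatment, covariate, or outcome columns.
keep <- complete.cases(toy[, c("treat", "x1", "y")])
toy_complete <- toy[keep, ]

# Only the fully observed rows remain.
nrow(toy_complete)
```

Running the same check on your own data before calling nawt shows how many observations the returned weights and propensity scores will cover.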
The parameter of interest is estimated by the Hajek estimator, where inverse probability weights are standardized to sum to 1 within each treatment group after being calculated as t_i / π_i - (1 - t_i) / (1 - π_i) for the ATE estimation, (t_i - π_i) / (1 - π_i) for the ATT estimation, (t_i - π_i) / π_i for the ATC estimation, and (1 - t_i) / (1 - π_i) for the missing outcome cases.
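As an illustration of the Hajek estimator for the ATT, the following base R sketch computes the raw weights (t_i - π_i) / (1 - π_i), standardizes them within each treatment group, and takes the difference in weighted means. The helper hajek_att and the toy inputs are hypothetical and not part of the package; the sign handling for the control group is an assumption about how the standardization is carried out.

```r
# Hypothetical helper (not part of the NAWT package): Hajek-type ATT estimate
# from outcomes y, binary treatment t, and propensity scores ps.
hajek_att <- function(y, t, ps) {
  # Raw ATT weights: 1 for treated units, -ps / (1 - ps) for control units.
  w <- (t - ps) / (1 - ps)
  # Standardize within each group so that each set of weights sums to 1
  # (assumption: dividing by the group total, which also removes the sign).
  w_treat <- w[t == 1] / sum(w[t == 1])
  w_ctrl  <- w[t == 0] / sum(w[t == 0])
  sum(w_treat * y[t == 1]) - sum(w_ctrl * y[t == 0])
}

# Toy inputs chosen for easy hand calculation.
y  <- c(12, 14, 3, 5)
t  <- c(1, 1, 0, 0)
ps <- c(0.5, 0.5, 0.2, 0.5)
att_hat <- hajek_att(y, t, ps)
att_hat
```

Here the treated mean is 13 and the controls receive standardized weights 0.2 and 0.8, so the weighted control mean is 4.6.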
For the ATE estimation, it is recommended to specify the estimand
as "ATE", but you may specify it as "ATEcombined". The former utilizes
the separated estimation whereas the latter utilizes the combined estimation,
and the former should produce smaller biases and variances. Note that the
former estimates two propensity scores for each observation by estimating two
propensity score functions with different estimating equations.
When a two-step estimator is used in nawt with method = "both",
scratio (r) is calculated in the first step. scratio is a
ratio of accuracy in propensity score estimation in the NAWT with a power
weighting function with a specified alpha to that with a covariate balancing
weighting function. It determines the mixture weight in the second step, like
the weighting matrix in the two-step over-identified GMM estimation, where
weighted estimating equations of those with the power weighting function and
the covariate balancing function are used. This mixture weight is proportional
to the scratio (e.g., ω(π) = r π^α + (1 - r) / (1 - π),
in the ATT estimation).
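The mixture weighting function for the ATT case can be written out directly. The sketch below is a plain transcription of ω(π) = r π^α + (1 - r) / (1 - π) into base R; the function name omega_att and the example values of r are hypothetical, and in practice r would come from the first step rather than being set by hand.

```r
# Hypothetical helper (not part of the NAWT package): mixture of the power
# weighting function and the covariate balancing weighting function (ATT case).
omega_att <- function(ps, alpha = 2, r = 0.5) {
  r * ps^alpha + (1 - r) / (1 - ps)
}

ps <- c(0.1, 0.5, 0.9)

omega_att(ps, alpha = 2, r = 1)    # pure power weighting: ps^2
omega_att(ps, alpha = 2, r = 0)    # pure covariate balancing: 1 / (1 - ps)
omega_att(ps, alpha = 2, r = 0.5)  # equal mixture of the two
```

Setting r to 1 or 0 recovers the two pure weighting functions, which is a quick way to check that the mixture behaves as described.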
Since the NAWT utilizes weighted estimating equations in propensity score
estimation, it may sometimes become unstable, especially when only a few
observations have extremely large weights in propensity score estimation.
nawt generates a warning when the effective sample size for propensity
score estimation is smaller than a quarter of the effective sample size with
the initial weights. In that case, carefully look at the estimated
coefficients to check whether the estimation has failed; cbcheck
will be helpful.
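To see why a few extreme weights trigger this warning, the sketch below uses Kish's effective sample size, (Σ w)^2 / Σ w^2. Whether nawt uses exactly this definition is an assumption; the helper eff_n and the toy weight vectors are hypothetical.

```r
# Hypothetical helper (not part of the NAWT package): Kish's effective
# sample size, assumed here as the definition of "effective sample size".
eff_n <- function(w) sum(w)^2 / sum(w^2)

w_initial <- rep(1, 100)          # equal initial weights
w_skewed  <- c(rep(1, 99), 500)   # one observation dominates

eff_n(w_initial)
eff_n(w_skewed)

# The condition described above: less than a quarter of the initial value.
eff_n(w_skewed) < eff_n(w_initial) / 4
```

With one dominant weight the effective sample size collapses to roughly one observation, which is the situation in which the estimated coefficients deserve a closer look.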
nawt returns an object of class inheriting from "nawt".
The function summary (i.e., summary.nawt) can be used to obtain or print a
summary of the results.
An object of class "nawt" is a list containing the following components:
est 
the point estimate of the parameter of interest. 
weights 
the estimated inverse probability weights. 
ps 
the estimated propensity scores. A matrix of two sets of the
estimated propensity scores is returned when 
coefficients 
a named vector of coefficients. A matrix of two sets of
coefficients for two sets of propensity scores is returned when

varcov 
the variance-covariance matrix of the coefficients and parameter of interest. 
converged 
logical. Was the algorithm judged to have converged? 
naive_weights 
the estimated inverse probability weights with the standard logistic regression for the propensity score estimation. 
naive_coef 
a named vector of coefficients with the standard logistic regression for the propensity score estimation. 
scratio 
an optimal ratio of the covariate balancing weighting function
to the power weighting function in taking the weighted average weights for
the weighted score conditions when 
estimand 
the parameter of interest specified. 
method 
the method specified. 
outcome 
the outcome vector. 
alpha 
alpha specified. 
names.x 
names of the explanatory variables in propensity score estimation. 
prior.weights 
the weights initially supplied, a vector of 1s if none were. 
treat 
the treatment vector, or the missingness vector in missing outcome cases. 
ci 
a matrix of the confidence intervals for the parameter of interest. 
omega 
a vector of weights for the weighted score conditions (ω).
A matrix of two sets of omega is returned when 
effN_ps 
the effective sample size for the propensity score estimation.
A vector of length two for two propensity score estimation is returned when

effN_est 
the effective sample size for the parameter of interest estimation. 
effN_original 
the effective sample size with the initial weights. 
formula 
formula specified. 
call 
the matched call. 
data 
the data argument. 
Hiroto Katsumata
Imai, Kosuke and Marc Ratkovic. 2014. "Covariate Balancing Propensity Score." Journal of the Royal Statistical Society, Series B (Statistical Methodology) 76 (1): 243–63.
Christian Fong, Marc Ratkovic and Kosuke Imai (2019). CBPS: Covariate Balancing Propensity Score. R package version 0.21. https://CRAN.R-project.org/package=CBPS
Katsumata, Hiroto. 2020. "Navigated Weighting to Improve Inverse Probability Weighting for Missing Data Problems and Causal Inference." arXiv preprint arXiv:2005.10998.
# Simulation from Kang and Schafer (2007) and Imai and Ratkovic (2014)
# ATT estimation
# True ATT is 10
tau <- 10
set.seed(12345)
n <- 1000
X <- matrix(rnorm(n * 4, mean = 0, sd = 1), nrow = n, ncol = 4)
prop <- 1 / (1 + exp(X[, 1] - 0.5 * X[, 2] + 0.25 * X[, 3] + 0.1 * X[, 4]))
treat <- rbinom(n, 1, prop)
y <- 210 + 27.4 * X[, 1] + 13.7 * X[, 2] + 13.7 * X[, 3] + 13.7 * X[, 4] +
  tau * treat + rnorm(n)
df <- data.frame(X, treat, y)
colnames(df) <- c("x1", "x2", "x3", "x4", "treat", "y")
# A misspecified model
Xmis <- data.frame(x1mis = exp(X[, 1] / 2),
                   x2mis = X[, 2] * (1 + exp(X[, 1]))^(-1) + 10,
                   x3mis = (X[, 1] * X[, 3] / 25 + 0.6)^3,
                   x4mis = (X[, 2] + X[, 4] + 20)^2)
# Data frame and formulas for propensity score estimation
df <- data.frame(df, Xmis)
formula_c < as.formula(treat ~ x1 + x2 + x3 + x4)
formula_m < as.formula(treat ~ x1mis + x2mis + x3mis + x4mis)
# Correct propensity score model
# Power weighting function with alpha = 2
fits2c <- nawt(formula = formula_c, outcome = "y", estimand = "ATT",
               method = "score", data = df, alpha = 2)
summary(fits2c)
# Covariate balancing weighting function
fitcbc <- nawt(formula = formula_c, outcome = "y", estimand = "ATT",
               method = "cb", data = df)
summary(fitcbc)
# Standard logistic regression
fits0c <- nawt(formula = formula_c, outcome = "y", estimand = "ATT",
               method = "score", data = df, alpha = 0)
summary(fits0c)
# Misspecified propensity score model
# Power weighting function with alpha = 2
fits2m <- nawt(formula = formula_m, outcome = "y", estimand = "ATT",
               method = "score", data = df, alpha = 2)
summary(fits2m)
# Covariate balancing weighting function
fitcbm <- nawt(formula = formula_m, outcome = "y", estimand = "ATT",
               method = "cb", data = df)
summary(fitcbm)
# Standard logistic regression
fits0m <- nawt(formula = formula_m, outcome = "y", estimand = "ATT",
               method = "score", data = df, alpha = 0)
summary(fits0m)
# Empirical example
# Load the LaLonde data
data(LaLonde)
formula_l <- as.formula("exper ~ age + I(age^2) + educ + I(educ^2) +
                        black + hisp + married + nodegr +
                        I(re75 / 1000) + I(re75 == 0) + I(re74 / 1000)")
# Experimental benchmark
mean(subset(LaLonde, exper == 1 & treat == 1)$re78) -
  mean(subset(LaLonde, exper == 1 & treat == 0)$re78)
# Power weighting function with alpha = 2
fits2l <- nawt(formula = formula_l, estimand = "ATT", method = "score",
               outcome = "re78", data = LaLonde, alpha = 2)
mean(subset(LaLonde, exper == 1 & treat == 1)$re78) -
  with(LaLonde, sum((1 - exper) * re78 * fits2l$weights) /
         sum((1 - exper) * fits2l$weights))
# Covariate balancing weighting function
fitcbl <- nawt(formula = formula_l, estimand = "ATT", method = "cb",
               outcome = "re78", data = LaLonde)
mean(subset(LaLonde, exper == 1 & treat == 1)$re78) -
  with(LaLonde, sum((1 - exper) * re78 * fitcbl$weights) /
         sum((1 - exper) * fitcbl$weights))
# Standard logistic regression
fits0l <- nawt(formula = formula_l, estimand = "ATT", method = "score",
               outcome = "re78", data = LaLonde, alpha = 0)
mean(subset(LaLonde, exper == 1 & treat == 1)$re78) -
  with(LaLonde, sum((1 - exper) * re78 * fits0l$weights) /
         sum((1 - exper) * fits0l$weights))
