Description
Assessing balance between exposure(s) and confounders is key when performing causal analysis using propensity scores. We provide a list of several models to generate weights to use in causal inference for multivariate exposures, and test the balancing property of these weights using weighted Pearson correlations. In addition, the effective sample size is returned.
Usage

bal(
  model_list,
  D,
  C,
  common = FALSE,
  trim_w = FALSE,
  trim_quantile = 0.99,
  all_uni = TRUE,
  ...
)
Arguments

model_list: character string identifying which methods to use when constructing weights. See Details for the list of available models.

D: numeric matrix of dimension n by m designating values of the exposures.

C: either a list of length m of numeric matrices, each of dimension n by p_j, designating values of the confounders for each exposure, or, if common=TRUE, a single numeric matrix of confounders shared by all exposures.

common: logical indicator for whether C is a single matrix of common confounders for all exposures. Default is FALSE, meaning C must be specified as a list of confounders of length m.

trim_w: logical indicator for whether to trim weights. Default is FALSE.

trim_quantile: numeric scalar specifying the upper quantile at which to trim weights, if applicable. Default is 0.99.

all_uni: logical indicator. If TRUE, all univariate models specified in model_list are estimated for each exposure. If FALSE, weights are estimated only for the first exposure.

...: additional arguments to pass to weightit when estimating the univariate methods.
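To make the two forms of C concrete, here is a minimal sketch of both call patterns (D_mat, C1, C2, and C_common are hypothetical objects standing in for real data):

#C as a list: one confounder matrix per exposure column of D_mat
b_sep <- bal(model_list="mvGPS", D=D_mat, C=list(C1, C2))

#C as a single matrix shared by all exposures: set common=TRUE
b_com <- bal(model_list="mvGPS", D=D_mat, C=C_common, common=TRUE)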
Details

When using propensity score methods for causal inference, it is crucial to check the balancing property of the covariates and exposure(s). To do this in the multivariate case, we first use a weight generating method from the available list shown below:
"mvGPS": Multivariate generalized propensity score using Gaussian densities
"entropy": Estimating weights using entropy loss function without specifying propensity score \insertCitetbbicke2020entropymvGPS
"CBPS": Covariate balancing propensity score for continuous treatments which adds balance penalty while solving for propensity score parameters \insertCitefong2018mvGPS
"PS": Generalized propensity score estimated using univariate Gaussian densities
"GBM": Gradient boosting to estimate the mean function of the propensity score, but still maintains Gaussian distributional assumptions \insertCitezhu_boostingmvGPS
Note that only the mvGPS method is multivariate; all others are strictly univariate. For the univariate methods, weights are estimated for each exposure separately using the weightit function, given the confounders for that exposure in C, when all_uni=TRUE. To estimate weights for only the first exposure, set all_uni=FALSE.
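For instance, a single univariate fit of this kind might look like the following sketch (dat is a hypothetical data frame; "cbps" is one of several methods weightit accepts for continuous exposures):

require(WeightIt)
#sketch: estimate weights for exposure d1 given its confounders x1 and x2
uni_fit <- weightit(d1 ~ x1 + x2, data=dat, method="cbps")
w_d1 <- uni_fit$weights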
It is also important to note that the weights for each method can be trimmed at the desired quantile by setting trim_w=TRUE and choosing trim_quantile in [0.5, 1]. Trimming is done at both the upper and lower bounds. For further details on how trimming is performed, see mvGPS.
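As a sketch, symmetric quantile trimming of a weight vector w at trim_quantile q can be written with pmin/pmax. This mirrors the description above; the package's exact implementation may differ:

#winsorize weights at the upper quantile q and the lower quantile 1-q
q <- 0.99
w_trim <- pmin(w, quantile(w, q))
w_trim <- pmax(w_trim, quantile(w, 1 - q))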
In this package we include three key balancing metrics to summarize balance across all of the exposures:
Euclidean distance
Maximum absolute correlation
Average absolute correlation
Euclidean distance is calculated using the origin as the reference point, e.g., for m=2 exposures the reference point is [0, 0]. In this way we are calculating how far the observed set of correlation points is from perfect balance.
Maximum absolute correlation reports the largest single imbalance between the exposures and the set of confounders. It is often a key diagnostic as even a single confounder that is sufficiently out of balance can reduce performance.
Average absolute correlation is the mean of the absolute exposure-confounder correlations. This metric summarizes how well, on average, the entire set of exposures is balanced.
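Putting these together, here is a sketch of how the three metrics could be computed from weighted Pearson correlations using cov.wt from the stats package (D_mat, C_mat, and w are hypothetical; the exposure-confounder block of the weighted correlation matrix supplies the correlation points):

#weighted correlation matrix of exposures and confounders
wc <- cov.wt(cbind(D_mat, C_mat), wt=w, cor=TRUE)$cor
m <- ncol(D_mat)
rho <- wc[seq_len(m), -seq_len(m), drop=FALSE] #exposure-confounder correlations

sqrt(sum(rho^2)) #Euclidean distance from the origin (perfect balance)
max(abs(rho))    #maximum absolute correlation
mean(abs(rho))   #average absolute correlation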
Effective sample size, ESS, is defined as

ESS = (Σ_i w_i)^2 / Σ_i w_i^2,

where w_i are the estimated weights for a particular method (Kish 1965). Note that when w_i = 1 for all units, the ESS is equal to the sample size n. ESS decreases when there are extreme weights or high variability in the weights.
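In code, the ESS of a weight vector w is a one-liner:

ess <- sum(w)^2/sum(w^2) #equals the sample size n when all w_i = 1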
Value

bal returns a list with the following components:

W: list of weights generated for each model

cor_list: list of weighted Pearson correlation coefficients for all confounders specified

bal_metrics: data.frame with the Euclidean distance, maximum absolute correlation, and average absolute correlation by method

ess: effective sample size for each of the methods used to generate weights

models: vector of models used
Examples

#simulating data
sim_dt <- gen_D(method="u", n=150, rho_cond=0.2, s_d1_cond=2, s_d2_cond=2,
    k=3, C_mu=rep(0, 3), C_cov=0.1, C_var=1, d1_beta=c(0.5, 1, 0),
    d2_beta=c(0, 0.3, 0.75), seed=06112020)
D <- sim_dt$D
C <- sim_dt$C

#generating weights using mvGPS and potential univariate alternatives
require(WeightIt)
bal_sim <- bal(model_list=c("mvGPS", "entropy", "CBPS", "PS", "GBM"), D,
    C=list(C[, 1:2], C[, 2:3]))

#overall summary statistics
bal_sim$bal_metrics

#effective sample sizes
bal_sim$ess

#we can also trim weights for all methods; note that in this case we can
#also pass additional arguments used by the WeightIt package for entropy,
#CBPS, PS, and GBM, such as specifying p.mean
bal_sim_trim <- bal(model_list=c("mvGPS", "entropy", "CBPS", "PS", "GBM"), D,
    C=list(C[, 1:2], C[, 2:3]), trim_w=TRUE, trim_quantile=0.9, p.mean=0.5)

#can check to ensure all the weights have been properly trimmed at the upper
#and lower bounds (matching trim_quantile=0.9 above)
all.equal(unname(unlist(lapply(bal_sim$W, quantile, 0.9))),
    unname(unlist(lapply(bal_sim_trim$W, max))))
all.equal(unname(unlist(lapply(bal_sim$W, quantile, 1-0.9))),
    unname(unlist(lapply(bal_sim_trim$W, min))))