balqual | R Documentation |
The balqual() function evaluates the balance quality of a dataset after matching, comparing it to the original unbalanced dataset. It computes various summary statistics and provides an easy interpretation using user-specified cutoff values.
balqual(
  matched_data = NULL,
  formula = NULL,
  type = c("smd", "r", "var_ratio"),
  statistic = c("mean", "max"),
  cutoffs = NULL,
  round = 3
)
matched_data |
An object of class |
formula |
A valid R formula used to compute generalized propensity
scores during the first step of the vector matching algorithm in
|
type |
A character vector specifying the quality metrics to calculate.
Can maximally contain 3 values in a vector created by the
|
statistic |
A character vector specifying the type of statistics used to summarize the quality metrics. Since quality metrics are calculated for all pairwise comparisons between treatment levels, they need to be aggregated for the entire dataset.
To compute both, provide both names using the |
cutoffs |
A numeric vector with the same length as the number of
coefficients specified in the |
round |
An integer specifying the number of decimal places to round the output to. |
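The aggregation described for the statistic argument can be illustrated with a minimal base R sketch. It does not call balqual() and makes no claim about the package's internal formulas: the toy data, the helper smd_pair(), and the pooled-standard-deviation definition of the standardized mean difference are all assumptions made for illustration.

```r
# Toy data: three treatment levels and one covariate (values are made up).
toy <- data.frame(
  treat = rep(c("a", "b", "c"), each = 4),
  age = c(50, 52, 54, 56, 60, 62, 64, 66, 55, 57, 59, 61),
  stringsAsFactors = FALSE
)

# Hypothetical helper: absolute standardized mean difference for one pair of
# groups, scaled by the pooled standard deviation (an assumed definition).
smd_pair <- function(x, g, lv1, lv2) {
  x1 <- x[g == lv1]
  x2 <- x[g == lv2]
  abs(mean(x1) - mean(x2)) / sqrt((var(x1) + var(x2)) / 2)
}

# Enumerate all pairwise comparisons between treatment levels.
pairs <- combn(unique(toy$treat), 2)
smd_all <- apply(pairs, 2, function(p) smd_pair(toy$age, toy$treat, p[1], p[2]))
names(smd_all) <- apply(pairs, 2, paste, collapse = " vs ")

# Aggregate over the pairs, analogous to statistic = c("mean", "max").
smd_summary <- c(mean = mean(smd_all), max = max(smd_all))

# Compare the worst pair against a cutoff such as 0.2.
balanced <- smd_summary[["max"]] <= 0.2
```

The maximum is the stricter summary: a single badly balanced pair of treatment levels is enough to exceed the cutoff, whereas the mean can hide it.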
If assigned to a name, returns a list of summary statistics of class quality containing:
quality_mean - A data frame with the mean values of the statistics specified in the type argument for all balancing variables used in formula.
quality_max - A data frame with the maximal values of the statistics specified in the type argument for all balancing variables used in formula.
perc_matched - A single numeric value indicating the percentage of observations in the original dataset that were matched.
statistic - A single string defining which statistic will be displayed in the console.
summary_head - A summary of the matching process. If max is included in the statistic, it contains the maximal observed values for each variable; otherwise, it includes the mean values.
n_before - The number of observations in the dataset before matching.
n_after - The number of observations in the dataset after matching.
count_table - A contingency table showing the distribution of the treatment variable before and after matching.
The balqual() function also prints a well-formatted table with the defined summary statistics for each variable in the formula to the console.
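As a rough illustration of what the count-based elements of the returned list describe, the following base R sketch derives analogous quantities from two made-up treatment vectors. It does not use balqual(), and the way the percentage is computed here (matched observations divided by original observations) is an assumption, not a statement about the package's implementation.

```r
# Hypothetical treatment labels before and after matching (made-up data).
treat_before <- c(rep("control", 6), rep("treated", 4))
treat_after <- c(rep("control", 4), rep("treated", 4))

# Sample sizes before and after matching.
n_before <- length(treat_before)
n_after <- length(treat_after)

# Share of the original observations retained by matching, in percent
# (assumed definition).
perc_matched <- 100 * n_after / n_before

# Treatment distribution before and after matching, side by side.
lv <- sort(unique(treat_before))
count_table <- rbind(
  before = table(factor(treat_before, levels = lv)),
  after = table(factor(treat_after, levels = lv))
)
```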
Rubin, D.B. Using Propensity Scores to Help Design Observational Studies: Application to the Tobacco Litigation. Health Services & Outcomes Research Methodology 2, 169–188 (2001). https://doi.org/10.1023/A:1020363010465
Lopez, M.J., Gutman, R. Estimation of Causal Effects with Multiple Treatments: A Review and New Ideas. Statistical Science 32(3), 432–454 (2017).
match_gps() for matching the generalized propensity scores; estimate_gps() for the documentation of the formula argument.
# Balance the treatment variable in the cancer dataset on the age and sex
# covariates
library(vecmatch)
data(cancer)

# First, define the formula
formula_cancer <- formula(status ~ age * sex)

# Then estimate the generalized propensity scores
gps_cancer <- estimate_gps(formula_cancer,
  cancer,
  method = "multinom",
  reference = "control",
  verbose_output = TRUE
)

# Drop observations outside the common support region
csr_cancer <- csregion(gps_cancer)

# Match the remaining samples using `match_gps()`
matched_cancer <- match_gps(csr_cancer,
  reference = "control",
  caliper = 1,
  kmeans_cluster = 5,
  kmeans_args = list(n.iter = 100),
  verbose_output = TRUE
)

# Finally, assess the quality of matching using `balqual()`
balqual(
  matched_data = matched_cancer,
  formula = formula_cancer,
  type = "smd",
  statistic = "max",
  round = 3,
  cutoffs = 0.2
)