matchthem: Matches Multiply Imputed Datasets

matchthem R Documentation

Description

matchthem() performs matching in the supplied multiply imputed datasets, given as mids or amelia objects, by running MatchIt::matchit() on each of the multiply imputed datasets with the supplied arguments.

Usage

matchthem(
  formula,
  datasets,
  approach = "within",
  method = "nearest",
  distance = "glm",
  link = "logit",
  distance.options = list(),
  discard = "none",
  reestimate = FALSE,
  ...
)

Arguments

formula

A formula of the form z ~ x1 + x2, where z is the exposure and x1 and x2 are the covariates to be balanced, which is passed directly to MatchIt::matchit() to specify the propensity score model or treatment and covariates to be used in matching. See MatchIt::matchit() for details.

datasets

The datasets containing the exposure and the potential confounders named in the formula. This argument must be an object of the mids or amelia class, typically produced by a previous call to the mice() function from the mice package or to the amelia() function from the Amelia package. The Amelia package can impute missing data in a single cross-sectional dataset or in a time-series dataset; currently, the MatchThem package supports only the former.
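As a quick check before matching, the class of the imputed object can be inspected. The sketch below assumes the mice package is installed and uses its bundled nhanes data purely for illustration; the object name imputed is hypothetical.

```r
# Impute a small example dataset shipped with mice (illustrative only)
imputed <- mice::mice(mice::nhanes, m = 5, printFlag = FALSE)

# matchthem() expects an object of class "mids" (or "amelia")
is.mids <- inherits(imputed, "mids")
is.mids

# Number of imputed datasets stored in the object
imputed$m
```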

approach

The approach that should be used to combine information in multiply imputed datasets. Currently, "within" (performing matching within each dataset) and "across" (estimating propensity scores within each dataset, averaging them across datasets, and performing matching using the averaged propensity scores in each dataset) approaches are available. The default is "within", which has been shown to have superior performance in most cases.

method

The matching method. Currently, "nearest" (nearest neighbor matching), "exact" (exact matching), "full" (optimal full matching), "genetic" (genetic matching), "subclass" (subclassification), "cem" (coarsened exact matching), "optimal" (optimal pair matching), "quick" (generalized full matching), and "cardinality" (cardinality and profile matching) are available. Only methods that involve a propensity score ("nearest", "full", "genetic", "subclass", "optimal", and "quick") are compatible with the "across" approach. The default is "nearest" for nearest neighbor matching. See MatchIt::matchit() for details.

distance

The method used to estimate the distance measure (e.g., propensity scores) used in matching, if any. Only options that specify a method of estimating propensity scores (i.e., not "mahalanobis") are compatible with the "across" approach. The default is "glm" for estimating propensity scores using logistic regression. See MatchIt::matchit() and MatchIt::distance for details and allowable options.

link, distance.options, discard, reestimate

Arguments passed to MatchIt::matchit() to control estimation of the distance measure (e.g., propensity scores).

...

Additional arguments passed to MatchIt::matchit().

Details

If an amelia object is supplied to datasets, it is first transformed into a mids object. matchthem() then calls mice::complete() on the mids object to extract each completed dataset and calls MatchIt::matchit() on each one, storing the output of each matchit() call, together with the mids object, in the result. All arguments supplied to matchthem() except datasets and approach are passed directly to matchit(). With the "across" approach, the estimated propensity scores are averaged across the multiply imputed datasets and supplied to a second set of calls to matchit().
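The averaging step behind the "across" approach can be illustrated in base R. This is only a sketch under stated assumptions, not the package's internal code: the completed datasets are simulated stand-ins, and the names completed, ps.within, and ps.across are hypothetical.

```r
# Simulated stand-ins for m completed (imputed) datasets
set.seed(1)
n <- 200
m <- 5

# Fully observed exposure z and covariate x1
x1 <- rnorm(n)
z  <- rbinom(n, 1, plogis(0.5 * x1))

# x2 differs between datasets, mimicking imputation variability
completed <- lapply(seq_len(m), function(i) {
  data.frame(z = z, x1 = x1, x2 = 0.3 * x1 + rnorm(n, sd = 0.2))
})

# "within": one propensity score vector per dataset (an n x m matrix)
ps.within <- sapply(completed, function(d) {
  fitted(glm(z ~ x1 + x2, data = d, family = binomial(link = "logit")))
})

# "across": average each unit's propensity scores over the m datasets
ps.across <- rowMeans(ps.within)
```

With the "within" approach, each column of ps.within would be used for matching in its own dataset; with the "across" approach, the single averaged vector would be used for matching in every dataset.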

Value

An object of the mimids (matched multiply imputed datasets) class, which includes the supplied mids object (or the amelia object transformed into a mids object, if one was supplied) and the output of the calls to matchit() on each multiply imputed dataset.
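A brief way to look at the returned object, assuming the MatchThem and mice packages are installed. The sketch uses only generic inspection functions and makes no assumption about the names of the object's components.

```r
library(MatchThem)

# Dataset shipped with MatchThem
data(osteoarthritis)

# Impute, then match within each imputed dataset
imputed.datasets <- mice::mice(osteoarthritis, m = 5, printFlag = FALSE)
matched.datasets <- matchthem(OSP ~ AGE + SEX + BMI + RAC + SMK,
                              imputed.datasets,
                              approach = "within",
                              method = "nearest")

# Confirm the class and list the top-level components
class(matched.datasets)
str(matched.datasets, max.level = 1)
```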

Author(s)

Farhad Pishgar and Noah Greifer

References

Daniel Ho, Kosuke Imai, Gary King, and Elizabeth Stuart (2007). Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference. Political Analysis, 15(3): 199-236. https://gking.harvard.edu/files/abs/matchp-abs.shtml

Stef van Buuren and Karin Groothuis-Oudshoorn (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3): 1-67. doi:10.18637/jss.v045.i03

Gary King, James Honaker, Anne Joseph, and Kenneth Scheve (2001). Analyzing Incomplete Political Science Data: An Alternative Algorithm for Multiple Imputation. American Political Science Review, 95: 49–69. https://gking.harvard.edu/files/abs/evil-abs.shtml

See Also

mimids

with()

pool()

weightthem()

MatchIt::matchit()

Examples

# Example 1: imputing with mice and matching within each dataset

library(MatchThem)

# Loading the dataset
data(osteoarthritis)

# Multiply imputing the missing values
imputed.datasets <- mice::mice(osteoarthritis, m = 5)

# Matching the multiply imputed datasets
matched.datasets <- matchthem(OSP ~ AGE + SEX + BMI + RAC + SMK,
                              imputed.datasets,
                              approach = "within",
                              method = "nearest")

# Example 2: imputing with Amelia and matching on averaged propensity scores

# Loading the dataset
data(osteoarthritis)

# Multiply imputing the missing values (noms lists the nominal variables)
imputed.datasets <- Amelia::amelia(osteoarthritis, m = 5,
                                   noms = c("SEX", "RAC", "SMK", "OSP", "KOA"))

# Matching the multiply imputed datasets
matched.datasets <- matchthem(OSP ~ AGE + SEX + BMI + RAC + SMK,
                              imputed.datasets,
                              approach = "across",
                              method = "nearest")

MatchThem documentation built on May 29, 2024, 6:24 a.m.