matchthem: Matches Multiply Imputed Datasets


Description

matchthem() performs matching in the supplied imputed datasets, given as mids or amelia objects, by running MatchIt::matchit() on each of the imputed datasets with the supplied arguments.

Usage

matchthem(
  formula,
  datasets,
  approach = "within",
  method = "nearest",
  distance = "glm",
  link = "logit",
  distance.options = list(),
  discard = "none",
  reestimate = FALSE,
  ...
)

Arguments

formula

A formula of the form z ~ x1 + x2, where z is the exposure and x1 and x2 are the covariates to be balanced. It is passed directly to MatchIt::matchit() to specify the propensity score model or the treatment and covariates to be used in matching. See matchit() for details.

datasets

The imputed datasets containing the exposure indicator and the potential confounders named in the formula. This argument must be an object of the mids or amelia class, which is typically produced by a previous call to mice() from the mice package or to amelia() from the Amelia package. The Amelia package can impute missing data in either a single cross-sectional dataset or a time-series dataset; the MatchThem package currently supports only the former.

approach

The approach used to combine information across imputed datasets. Currently, "within" (performing matching within each imputed dataset) and "across" (estimating propensity scores within each dataset, averaging them across datasets, and performing matching on the averaged propensity scores in each dataset) approaches are available. The default is "within", which has been shown to have superior performance in most cases.

method

This argument specifies a matching method. Currently, "nearest" (nearest neighbor matching), "exact" (exact matching), "full" (full matching), "genetic" (genetic matching), "subclass" (subclassification), "cem" (coarsened exact matching), and "optimal" (optimal matching) methods are available. Only methods that produce a propensity score ("nearest", "full", "genetic", "subclass", and "optimal") are compatible with the "across" approach. The default is "nearest" for nearest neighbor matching. See matchit() for details.

distance

The method used to estimate the distance measure (e.g., propensity scores) used in matching, if any. Only options that specify a method of estimating propensity scores (i.e., not "mahalanobis") are compatible with the "across" approach. The default is "glm" for propensity scores estimated using logistic regression. See matchit() and distance for details and allowable options.

link, distance.options, discard, reestimate

Arguments passed to matchit() to control estimation of the distance measure (e.g., propensity scores).

...

Additional arguments passed to matchit().

Details

If an amelia object is supplied to datasets, it will first be transformed into a mids object for further use. matchthem() works by calling mice::complete() on the mids object to extract each complete dataset and then calling MatchIt::matchit() on each one, storing the output of each matchit() call and the mids object in the output. All arguments supplied to matchthem() except datasets and approach are passed directly to matchit(). With the "across" approach, the estimated propensity scores are averaged across imputations and re-supplied to another set of calls to matchit().
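
The averaging step of the "across" approach can be illustrated with a minimal base R sketch. This is not the package's internal code: the simulated data stand in for imputed datasets, and every object name (toy.datasets, ps.by.dataset, ps.averaged) is hypothetical and chosen only for illustration.

```r
# Illustrative sketch only: simulated data stand in for imputed datasets,
# and none of these object names come from MatchThem itself.
set.seed(1)
n <- 200  # number of units
m <- 5    # number of imputed datasets

# Simulate an exposure z that depends on two covariates
x1      <- rnorm(n)
x2.true <- rnorm(n)
z       <- rbinom(n, 1, plogis(0.5 * x1 + 0.5 * x2.true))

# Build m toy "imputed" datasets: z and x1 are shared, while x2 varies
# slightly between datasets, as an imputed covariate would
toy.datasets <- lapply(seq_len(m), function(i) {
  data.frame(z = z, x1 = x1, x2 = x2.true + rnorm(n, sd = 0.3))
})

# Step 1: estimate propensity scores within each dataset (logistic regression)
ps.by.dataset <- sapply(toy.datasets, function(d) {
  fitted(glm(z ~ x1 + x2, data = d, family = binomial(link = "logit")))
})

# Step 2: average each unit's propensity score across the m datasets
ps.averaged <- rowMeans(ps.by.dataset)

# Step 3 (not run here): the averaged scores would then be re-supplied
# as the distance measure when matching within each dataset
head(ps.averaged)
```

Here ps.by.dataset is an n-by-m matrix with one column of scores per dataset, so each unit ends up with a single averaged score that is identical in every dataset.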

Value

An object of the mimids (matched multiply imputed datasets) class, which includes the supplied mids object (or an amelia object transformed into a mids object if supplied) and the output of the calls to matchit() on each imputed dataset.
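
A mimids object is usually passed on to with() and pool() (see See Also) to estimate and pool the exposure effect. The following is a hedged sketch rather than a recommended analysis: it assumes that the mice and survey packages are installed, that KOA is the outcome of interest in the osteoarthritis dataset, and that a quasibinomial outcome model is appropriate. The object names matched.models and pooled.results are illustrative.

```r
# Sketch under stated assumptions: the outcome (KOA) and the model family
# are illustrative choices, not prescribed by matchthem()
library(MatchThem)
library(survey)

# Impute and match, as in the Examples section
data(osteoarthritis)
imputed.datasets <- mice::mice(osteoarthritis, m = 5, printFlag = FALSE)
matched.datasets <- matchthem(OSP ~ AGE + SEX + BMI + RAC + SMK,
                              imputed.datasets,
                              approach = 'within',
                              method = 'nearest')

# Fit the outcome model in each matched dataset
matched.models <- with(matched.datasets,
                       svyglm(KOA ~ OSP, family = quasibinomial()))

# Pool the estimates across the matched datasets
pooled.results <- pool(matched.models)
summary(pooled.results, conf.int = TRUE)
```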

Author(s)

Farhad Pishgar and Noah Greifer

References

Daniel Ho, Kosuke Imai, Gary King, and Elizabeth Stuart (2007). Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference. Political Analysis, 15(3): 199-236. https://gking.harvard.edu/files/abs/matchp-abs.shtml

Stef van Buuren and Karin Groothuis-Oudshoorn (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3): 1-67. https://www.jstatsoft.org/v045/i03/

Gary King, James Honaker, Anne Joseph, and Kenneth Scheve (2001). Analyzing Incomplete Political Science Data: An Alternative Algorithm for Multiple Imputation. American Political Science Review, 95: 49-69. https://gking.harvard.edu/files/abs/evil-abs.shtml

See Also

mimids

with

pool

weightthem

MatchIt::matchit

Examples

#1

#Loading libraries
library(MatchThem)

#Loading the dataset
data(osteoarthritis)

#Multiply imputing the missing values
imputed.datasets <- mice::mice(osteoarthritis, m = 5)

#Matching the multiply imputed datasets
matched.datasets <- matchthem(OSP ~ AGE + SEX + BMI + RAC + SMK,
                              imputed.datasets,
                              approach = 'within',
                              method = 'nearest')

#2

#Loading libraries
library(MatchThem)

#Loading the dataset
data(osteoarthritis)

#Multiply imputing the missing values
imputed.datasets <- Amelia::amelia(osteoarthritis, m = 5,
                                   noms = c("SEX", "RAC", "SMK", "OSP", "KOA"))

#Matching the multiply imputed datasets
matched.datasets <- matchthem(OSP ~ AGE + SEX + BMI + RAC + SMK,
                              imputed.datasets,
                              approach = 'across',
                              method = 'nearest')

MatchThem documentation built on Aug. 23, 2021, 9:16 a.m.