matchthem: Matches Multiply Imputed Datasets
In MatchThem: Matching and Weighting Multiply Imputed Datasets


Description

matchthem() performs matching in the supplied multiply imputed datasets, given as mids or amelia objects, by running MatchIt::matchit() on each of the multiply imputed datasets with the supplied arguments.

Usage

matchthem(
  formula,
  datasets,
  approach = "within",
  method = "nearest",
  distance = "glm",
  link = "logit",
  distance.options = list(),
  discard = "none",
  reestimate = FALSE,
  ...
)

Arguments

`formula`	A `formula` of the form `z ~ x1 + x2`, where `z` is the exposure and `x1` and `x2` are the covariates to be balanced, which is passed directly to `MatchIt::matchit()` to specify the propensity score model or treatment and covariates to be used in matching. See `MatchIt::matchit()` for details.
`datasets`	The datasets containing the exposure and the potential confounders named in `formula`. This argument must be an object of the `mids` or `amelia` class, typically produced by a previous call to `mice()` from the mice package or to `amelia()` from the Amelia package. The Amelia package can impute missing data in a single cross-sectional dataset or in a time-series dataset; currently, the MatchThem package supports only the former.
`approach`	The approach that should be used to combine information in multiply imputed datasets. Currently, `"within"` (performing matching within each dataset) and `"across"` (estimating propensity scores within each dataset, averaging them across datasets, and performing matching using the averaged propensity scores in each dataset) approaches are available. The default is `"within"`, which has been shown to have superior performance in most cases.
`method`	The matching method. Currently, `"nearest"` (nearest neighbor matching), `"exact"` (exact matching), `"full"` (optimal full matching), `"genetic"` (genetic matching), `"subclass"` (subclassification), `"cem"` (coarsened exact matching), `"optimal"` (optimal pair matching), `"quick"` (generalized full matching), and `"cardinality"` (cardinality and profile matching) methods are available. Only methods that produce a propensity score (`"nearest"`, `"full"`, `"genetic"`, `"subclass"`, `"optimal"`, and `"quick"`) are compatible with the `"across"` approach. The default is `"nearest"` for nearest neighbor matching. See `MatchIt::matchit()` for details.
`distance`	The method used to estimate the distance measure (e.g., propensity scores) used in matching, if any. Only options that specify a method of estimating propensity scores (i.e., not `"mahalanobis"`) are compatible with the `"across"` approach. The default is `"glm"` for estimating propensity scores using logistic regression. See `MatchIt::matchit()` and `MatchIt::distance` for details and allowable options.
`link`, `distance.options`, `discard`, `reestimate`	Arguments passed to `MatchIt::matchit()` to control estimation of the distance measure (e.g., propensity scores).
`...`	Additional arguments passed to `MatchIt::matchit()`.
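
As an illustration of how the arguments above fit together, the following sketch sets the distance-related arguments explicitly and passes two extra arguments through `...`. The `caliper` and `ratio` arguments belong to `MatchIt::matchit()`, not to `matchthem()` itself; the particular values, the `"probit"` link, and the seed are arbitrary choices made for this example.

```r
library(MatchThem)

# Example data shipped with MatchThem, imputed with mice
data(osteoarthritis)
imputed.datasets <- mice::mice(osteoarthritis, m = 5,
                               printFlag = FALSE, seed = 2024)

# Named arguments control the approach, method, and distance model;
# caliper and ratio are forwarded to MatchIt::matchit() through `...`
matched.datasets <- matchthem(OSP ~ AGE + SEX + BMI + RAC + SMK,
                              datasets = imputed.datasets,
                              approach = "within",
                              method = "nearest",
                              distance = "glm",
                              link = "probit",
                              caliper = 0.1,
                              ratio = 2)
```

Because `"nearest"` with a `"glm"` distance produces a propensity score, the same call should also be compatible with `approach = "across"`.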

Details

If an amelia object is supplied to datasets, it is transformed into a mids object for further use. matchthem() works by calling mice::complete() on the mids object to extract each complete dataset and then calling MatchIt::matchit() on each one, storing the output of each matchit() call and the mids object in the output. All arguments supplied to matchthem() except datasets and approach are passed directly to matchit(). With the "across" approach, the estimated propensity scores are averaged across the multiply imputed datasets and then supplied to a second set of calls to matchit().
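
The procedure described above can be sketched by hand with mice and MatchIt. This is a conceptual illustration only, not the package's internal code: the object names (`within.fits`, `avg.ps`, `across.fits`) and the seed are hypothetical, and it relies on matchit() storing the estimated propensity scores in the `distance` component of its output and accepting a numeric vector for its `distance` argument.

```r
# Example data shipped with MatchThem, imputed with mice
data(osteoarthritis, package = "MatchThem")
imp <- mice::mice(osteoarthritis, m = 5, printFlag = FALSE, seed = 2024)
f <- OSP ~ AGE + SEX + BMI + RAC + SMK

# "within": one matchit() call per completed dataset
within.fits <- lapply(seq_len(imp$m), function(i) {
  MatchIt::matchit(f, data = mice::complete(imp, i), method = "nearest")
})

# "across", step 1: collect each unit's propensity score from every
# imputation (an n x m matrix) and average over the imputations
ps <- sapply(within.fits, function(fit) fit$distance)
avg.ps <- rowMeans(ps)

# "across", step 2: match again in each completed dataset, this time
# supplying the averaged scores as the distance measure
across.fits <- lapply(seq_len(imp$m), function(i) {
  MatchIt::matchit(f, data = mice::complete(imp, i),
                   method = "nearest", distance = avg.ps)
})
```

In the second step the matches are based on the same averaged scores in every dataset, although the covariate values in each completed dataset still differ wherever values were imputed.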

Value

An object of the mimids (matched multiply imputed datasets) class, which includes the supplied mids object (or the amelia object transformed into a mids object, if one was supplied) and the output of the calls to matchit() on each multiply imputed dataset.
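
A short sketch of inspecting the returned object follows. It assumes that the complete() function exported by MatchThem accepts a mimids object and an `action` argument selecting which imputed dataset to extract; the number of imputations and the seed are arbitrary.

```r
library(MatchThem)

# Example data shipped with MatchThem, imputed with mice
data(osteoarthritis)
imputed.datasets <- mice::mice(osteoarthritis, m = 3,
                               printFlag = FALSE, seed = 2024)

matched.datasets <- matchthem(OSP ~ AGE + SEX + BMI + RAC + SMK,
                              imputed.datasets,
                              approach = "within",
                              method = "nearest")

# Check the class of the returned object
is.matched <- inherits(matched.datasets, "mimids")

# Extract the first matched dataset as a data frame (assumed complete() method)
first.matched <- complete(matched.datasets, action = 1)
head(first.matched)
```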

Author(s)

Farhad Pishgar and Noah Greifer

References

Daniel Ho, Kosuke Imai, Gary King, and Elizabeth Stuart (2007). Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference. Political Analysis, 15(3): 199-236. https://gking.harvard.edu/files/abs/matchp-abs.shtml

Stef van Buuren and Karin Groothuis-Oudshoorn (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3): 1-67. doi:10.18637/jss.v045.i03

Gary King, James Honaker, Anne Joseph, and Kenneth Scheve (2001). Analyzing Incomplete Political Science Data: An Alternative Algorithm for Multiple Imputation. American Political Science Review, 95: 49–69. https://gking.harvard.edu/files/abs/evil-abs.shtml

Examples

#1

#Loading the package and the dataset
library(MatchThem)
data(osteoarthritis)

#Multiply imputing the missing values
imputed.datasets <- mice::mice(osteoarthritis, m = 5)

#Matching the multiply imputed datasets
matched.datasets <- matchthem(OSP ~ AGE + SEX + BMI + RAC + SMK,
                              imputed.datasets,
                              approach = 'within',
                              method = 'nearest')

#2

#Loading the dataset
data(osteoarthritis)

#Multiply imputing the missing values
imputed.datasets <- Amelia::amelia(osteoarthritis, m = 5,
                                   noms = c("SEX", "RAC", "SMK", "OSP", "KOA"))

#Matching the multiply imputed datasets
matched.datasets <- matchthem(OSP ~ AGE + SEX + BMI + RAC + SMK,
                              imputed.datasets,
                              approach = 'across',
                              method = 'nearest')

MatchThem documentation built on May 29, 2024, 6:24 a.m.