pairmatch | R Documentation |

Given a treatment group, a larger control reservoir, and a method for computing discrepancies between each treatment and control unit (or, optionally, a precomputed discrepancy matrix), finds a pairing of treatment units to controls that minimizes the sum of discrepancies.

pairmatch(x, controls = 1, data = NULL, remove.unmatchables = FALSE, ...)
pair(x, controls = 1, data = NULL, remove.unmatchables = FALSE, ...)

`x` |
Any valid input to `match_on`. Alternatively, a precomputed distance may be entered. |

`controls` |
The number of controls to be matched to each treatment unit. |

`data` |
Optional data set. |

`remove.unmatchables` |
Should treatment group members for which there are no eligible controls be removed prior to matching? |

`...` |
Additional arguments to pass to `match_on` or to `fullmatch`. |

This is a wrapper to `fullmatch`; see its documentation for more information, especially on additional arguments to pass, additional discussion of valid input for parameter `x`, and feasibility recovery.
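As a rough illustration of the wrapper relationship, the sketch below compares `pairmatch` with a direct `fullmatch` call on the `nuclearplants` data. The specific correspondence used here (setting `min.controls` and `max.controls` to 1 and using `omit.fraction` to discard surplus controls) is an assumption made for illustration, not something stated above; the variable names `n_t`, `n_c`, `pm` and `fm` are likewise hypothetical.

```r
library(optmatch)
data(nuclearplants)

# Counts of treated (pr == 1) and control (pr == 0) units.
n_t <- sum(nuclearplants$pr == 1)
n_c <- sum(nuclearplants$pr == 0)

# Pair matching through the wrapper.
pm <- pairmatch(pr ~ t1 + t2, data = nuclearplants)

# Assumed equivalent full-matching call: exactly one control per treated
# unit, with the surplus share of controls left unmatched.
fm <- fullmatch(pr ~ t1 + t2,
                min.controls = 1, max.controls = 1,
                omit.fraction = (n_c - n_t) / n_c,
                data = nuclearplants)

# Both calls should place the same number of units into matched sets.
c(pairmatch = sum(matched(pm)), fullmatch = sum(matched(fm)))
```

The two results need not pair the same units if several pairings share the minimal total distance, so only the counts are compared.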

If `remove.unmatchables` is `FALSE` and there are unmatchable treated units, the matching as a whole will fail and no units will be matched. If `TRUE`, the unmatchable units are removed and the function attempts to match each of the remaining treatment units. As of version 0.9-8, if there are fewer matchable controls than matchable treated units, `pairmatch` will attempt to place each of the matchable controls into a matched pair with a strict subset of the matchable treated units. (Previously, matching would have failed for subclasses of this structure.)
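The effect of `remove.unmatchables` can be sketched with a small hand-built distance matrix. Everything in this example (the matrix `d`, the unit names, and the data frame `units`) is hypothetical. How a failed match is reported is not described above, so the first call simply captures either a warning or an error.

```r
library(optmatch)

# Hypothetical toy distances: rows are treated units, columns are controls.
# Inf marks a forbidden pairing, so T3 has no eligible control at all.
d <- matrix(c(1,   2,   3,   4,
              2,   1,   4,   3,
              Inf, Inf, Inf, Inf),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("T1", "T2", "T3"),
                            c("C1", "C2", "C3", "C4")))

# Hypothetical data set whose row names identify all seven units.
units <- data.frame(z = c(1, 1, 1, 0, 0, 0, 0),
                    row.names = c(rownames(d), colnames(d)))

# Default behaviour: the unmatchable unit T3 stays in the problem.
# Capture whatever is signalled rather than assuming its form.
pm_keep <- tryCatch(
  pairmatch(d, data = units, remove.unmatchables = FALSE),
  warning = function(w) conditionMessage(w),
  error   = function(e) conditionMessage(e)
)
print(pm_keep)

# Drop T3 first, then pair the remaining treated units.
pm_drop <- pairmatch(d, data = units, remove.unmatchables = TRUE)
print(pm_drop)

# Units left without a match.
names(pm_drop)[is.na(pm_drop)]
```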

Matching can still fail, even with `remove.unmatchables` set to `TRUE`, if there is too much competition for certain controls; if you find yourself in that situation you should consider full matching, which necessarily finds a match for everyone with an eligible match somewhere.
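A minimal sketch of such competition, again with hypothetical toy data: treated units T1 and T2 are each eligible only for control C1, so no 1:1 pairing can cover every treated unit, whereas full matching can place both of them in a set with C1. The `pairmatch` call is wrapped in `tryCatch` because the form of the failure is not specified above.

```r
library(optmatch)

# Hypothetical distances: T1 and T2 compete for C1; T3 may take any control.
d <- matrix(c(1, Inf, Inf,
              2, Inf, Inf,
              1, 1,   1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("T1", "T2", "T3"),
                            c("C1", "C2", "C3")))

# Hypothetical data set whose row names identify all six units.
units <- data.frame(z = c(1, 1, 1, 0, 0, 0),
                    row.names = c(rownames(d), colnames(d)))

# Every unit has an eligible partner, yet pair matching is infeasible.
pm <- tryCatch(
  pairmatch(d, data = units, remove.unmatchables = TRUE),
  warning = function(w) conditionMessage(w),
  error   = function(e) conditionMessage(e)
)
print(pm)

# Full matching allows sets with several treated units per control.
fm <- fullmatch(d, data = units)
print(fm)

# Set label of each unit, keyed by unit name.
grp <- setNames(as.character(fm), names(fm))
```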

The units of the `optmatch` object returned correspond to members of the treatment and control groups in reference to which the matching problem was posed, and are named accordingly; the names are taken from the row and column names of `distance` (with possible additions from the optional `data` argument). Each element of the vector is the concatenation of: (i) a character abbreviation of `subclass.indices`, if that argument was given, or the string '`m`' if it was not; (ii) the string `.`; and (iii) a non-negative integer. Unmatched units have `NA` entries.

Secondarily, `fullmatch` returns various data about the matching process and its result, stored as attributes of the named vector which is its primary output. In particular, the `exceedances` attribute gives upper bounds, not necessarily sharp, for the amount by which the sum of distances between matched units in the result of `fullmatch` exceeds the least possible sum of distances between matched units in a feasible solution to the matching problem given to `fullmatch`. (Such a bound is also printed by `print.optmatch` and by `summary.optmatch`.)
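The structure described above can be inspected directly. The following sketch uses the `nuclearplants` data; the object names `pm` and `pairs` are illustrative only, and the exact form of the set labels may vary with the installed version of optmatch.

```r
library(optmatch)
data(nuclearplants)

pm <- pairmatch(pr ~ t1 + t2, data = nuclearplants)

# The result is a factor with one named entry per row of the data set.
is.factor(pm)
all.equal(names(pm), row.names(nuclearplants))

# First few entries: matched-set labels, with NA for unmatched units.
head(pm)

# Number of unmatched units.
sum(is.na(pm))

# Members of each matched set, keyed by set label.
pairs <- split(names(pm)[matched(pm)], as.character(pm)[matched(pm)])
head(pairs, 3)

# Bound on the distance from optimality, stored as an attribute.
attr(pm, "exceedances")
```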

An `optmatch` object (`factor`) indicating matched groups.

Hansen, B.B. and Klopfer, S.O. (2006), ‘Optimal full matching
and related designs via network flows’, *Journal of Computational
and Graphical Statistics*, **15**, 609–627.

`matched`, `caliper`, `fullmatch`

library(optmatch)
data(nuclearplants)

### Pair matching on a Mahalanobis distance
( pm1 <- pairmatch(pr ~ t1 + t2, data = nuclearplants) )
summary(pm1)

### Pair matching within a propensity score caliper.
ppty <- glm(pr ~ . - (pr + cost), family = binomial(), data = nuclearplants)
### For more complicated models, create a distance matrix and pass it to pairmatch.
mhd <- match_on(pr ~ t1 + t2, data = nuclearplants) +
  caliper(match_on(ppty), 2)
( pm2 <- pairmatch(mhd, data = nuclearplants) )
summary(pm2)

### Propensity balance assessment. Requires RItools package.
if (require(RItools)) summary(pm2, ppty)

### 1:2 matched triples
( tm <- pairmatch(pr ~ t1 + t2, controls = 2, data = nuclearplants) )
summary(tm)

### Creating a data frame with the matched sets attached.
### match_on(), caliper() and the like cooperate with pairmatch()
### to make sure observations are in the proper order:
all.equal(names(tm), row.names(nuclearplants))
### So our data frame including the matched sets is just
cbind(nuclearplants, matches = tm)

### In contrast, if your matching distance is an ordinary matrix
### (as earlier versions of optmatch required), you'll
### have to align it by observation name with your data set.
cbind(nuclearplants, matches = tm[row.names(nuclearplants)])

### Match in subgroups only. There are a few ways to specify this.
m1 <- pairmatch(pr ~ t1 + t2, data = nuclearplants,
                within = exactMatch(pr ~ pt, data = nuclearplants))
m2 <- pairmatch(pr ~ t1 + t2 + strata(pt), data = nuclearplants)

### Matching on propensity scores, within subgroups only:
m3 <- pairmatch(glm(pr ~ t1 + t2, data = nuclearplants, family = binomial),
                data = nuclearplants,
                within = exactMatch(pr ~ pt, data = nuclearplants))
m4 <- pairmatch(glm(pr ~ t1 + t2 + pt, data = nuclearplants,
                    family = binomial),
                data = nuclearplants,
                within = exactMatch(pr ~ pt, data = nuclearplants))
m5 <- pairmatch(glm(pr ~ t1 + t2 + strata(pt), data = nuclearplants,
                    family = binomial),
                data = nuclearplants)
# Including `strata(foo)` inside a glm uses `foo` in the model as
# well, so here m4 and m5 are equivalent. m3 differs in that it does
# not include `pt` in the glm.
