Optimal 1:1 and 1:k matching
Given a treatment group, a larger control reservoir, and a method for computing discrepancies between each treatment and control unit (or, optionally, a precomputed discrepancy matrix), finds a pairing of treatment units to controls that minimizes the sum of discrepancies.
x: Any valid input to match_on. Alternatively, a precomputed distance may be entered.
controls: The number of controls to be matched to each treatment unit.
data: Optional data set.
remove.unmatchables: Should treatment group members for which there are no eligible controls be removed prior to matching?
...: Additional arguments to pass to match_on or to fullmatch.
This is a wrapper to fullmatch; see its documentation for more information, especially on additional arguments to pass, additional discussion of valid input for parameter x, and feasibility recovery.
If remove.unmatchables is FALSE and there are unmatchable treated units, the matching as a whole will fail and no units will be matched. If TRUE, such units are removed and the function attempts to match each of the other treatment units. (In this case matching can still fail if there is too much competition for certain controls; if you find yourself in that situation, consider full matching, which necessarily finds a match for everyone with an eligible match somewhere.)
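As a rough illustration of the remove.unmatchables behavior described above, the sketch below builds a small distance matrix by hand. The unit names (t1 to t3, c1 to c4) and the distances are hypothetical, invented for this example. Rows stand for treated units and columns for controls, and Inf marks a forbidden pairing, so t2 has no eligible control.

```r
library(optmatch)

# Hypothetical toy distances: rows are treated units, columns are controls.
# Inf marks a forbidden pairing, so t2 has no eligible control at all.
d <- matrix(c(1,   2,   3,   4,
              Inf, Inf, Inf, Inf,
              2,   1,   4,   3),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("t1", "t2", "t3"),
                            c("c1", "c2", "c3", "c4")))

# Set the unmatchable treated unit aside and pair-match the remaining ones.
pm <- pairmatch(d, remove.unmatchables = TRUE)

print(pm)
unmatched(pm)  # logical vector flagging units left out of every matched set
```

Because no data argument is supplied here, the names of the result come from the row and column names of the matrix.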
The units of the optmatch object returned correspond to members of the treatment and control groups in reference to which the matching problem was posed, and are named accordingly; the names are taken from the row and column names of the distance (with possible additions from the optional data argument). Each element of the vector is the concatenation of: (i) a character abbreviation of subclass.indices, if that argument was given, or the string 'm' if it was not; (ii) the string '.'; and (iii) a non-negative integer. Unmatched units have NA entries.
fullmatch also returns various data about the matching process and its result, stored as attributes of the named vector which is its primary output. In particular, the exceedances attribute gives upper bounds, not necessarily sharp, for the amount by which the sum of distances between matched units in the result of fullmatch exceeds the least possible sum of distances between matched units in a feasible solution to the matching problem given to fullmatch. (Such a bound is also printed by print.optmatch and by summary.optmatch.)
An optmatch object (a factor) indicating matched groups.
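The returned object can be inspected like any other factor. The following is a minimal sketch, assuming the nuclearplants data shipped with optmatch and the same formula used in the examples on this page; the variable name pm is arbitrary.

```r
library(optmatch)
data(nuclearplants)

pm <- pairmatch(pr ~ t1 + t2, data = nuclearplants)

# The result is a factor with one entry per row of the data.
is.factor(pm)
length(pm) == nrow(nuclearplants)

# matched() flags units that were placed in a matched set.
# Controls left in the reservoir have NA entries.
sum(matched(pm))
head(pm)

# Diagnostics are stored as attributes of the result.
attr(pm, "exceedances")
```

With 1:1 matching, each matched set holds one treated unit and one control, so the number of matched units is twice the number of treated units when every treated unit finds a match.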
Hansen, B.B. and Klopfer, S.O. (2006), ‘Optimal full matching and related designs via network flows’, Journal of Computational and Graphical Statistics, 15, 609–627.
library(optmatch)
data(nuclearplants)

### Pair matching on a Mahalanobis distance
( pm1 <- pairmatch(pr ~ t1 + t2, data = nuclearplants) )
summary(pm1)

### Pair matching within a propensity score caliper.
ppty <- glm(pr ~ . - (pr + cost), family = binomial(), data = nuclearplants)
### For more complicated models, create a distance matrix and pass it to pairmatch.
mhd <- match_on(pr ~ t1 + t2, data = nuclearplants) +
  caliper(match_on(ppty), 2)
( pm2 <- pairmatch(mhd, data = nuclearplants) )
summary(pm2)

### Propensity balance assessment. Requires RItools package.
if (require(RItools)) summary(pm2, ppty)

### 1:2 matched triples
( tm <- pairmatch(pr ~ t1 + t2, controls = 2, data = nuclearplants) )
summary(tm)

### Creating a data frame with the matched sets attached.
### match_on(), caliper() and the like cooperate with pairmatch()
### to make sure observations are in the proper order:
all.equal(names(tm), row.names(nuclearplants))
### So our data frame including the matched sets is just
cbind(nuclearplants, matches = tm)

### In contrast, if your matching distance is an ordinary matrix
### (as earlier versions of optmatch required), you'll
### have to align it by observation name with your data set.
cbind(nuclearplants, matches = tm[row.names(nuclearplants)])

### Match in subgroups only. There are a few ways to specify this.
m1 <- pairmatch(pr ~ t1 + t2, data = nuclearplants,
                within = exactMatch(pr ~ pt, data = nuclearplants))
m2 <- pairmatch(pr ~ t1 + t2 + strata(pt), data = nuclearplants)

### Matching on propensity scores, within subgroups only:
m3 <- pairmatch(glm(pr ~ t1 + t2, data = nuclearplants, family = binomial),
                data = nuclearplants,
                within = exactMatch(pr ~ pt, data = nuclearplants))
m4 <- pairmatch(glm(pr ~ t1 + t2 + pt, data = nuclearplants,
                    family = binomial),
                data = nuclearplants,
                within = exactMatch(pr ~ pt, data = nuclearplants))
m5 <- pairmatch(glm(pr ~ t1 + t2 + strata(pt), data = nuclearplants,
                    family = binomial),
                data = nuclearplants)
# Including `strata(foo)` inside a glm uses `foo` in the model as
# well, so here m4 and m5 are equivalent. m3 differs in that it does
# not include `pt` in the glm.