# Optimal 1:1 and 1:k matching

### Description

Given a treatment group, a larger control reservoir, and a method for computing discrepancies between each treatment and control unit (or, optionally, a precomputed discrepancy matrix), finds a pairing of treatment units to controls that minimizes the sum of discrepancies.
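
To make the objective concrete, the following base-R sketch brute-forces a tiny 1:1 problem. The distance matrix `d` and the helper `best_pairing` are made up for illustration; this is not how the package solves the problem (the reference below describes a network-flow approach), so treat it only as a picture of what "minimizes the sum of discrepancies" means.

```r
# Toy discrepancy matrix: 2 treated units (rows), 3 controls (columns).
# All values are invented for illustration.
d <- matrix(c(1, 4, 2,
              2, 1, 5),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("T1", "T2"), c("C1", "C2", "C3")))

# Hypothetical helper: try every way of giving each treated unit a
# distinct control and keep the assignment with the smallest total.
best_pairing <- function(d) {
  # One row per candidate assignment: column i holds the control index
  # proposed for treated unit i.
  grid <- expand.grid(rep(list(seq_len(ncol(d))), nrow(d)))
  # Drop assignments that reuse a control.
  distinct <- apply(grid, 1, function(a) !anyDuplicated(a))
  grid <- as.matrix(grid[distinct, , drop = FALSE])
  # Total discrepancy of each remaining assignment.
  totals <- apply(grid, 1, function(a) sum(d[cbind(seq_len(nrow(d)), a)]))
  best <- grid[which.min(totals), ]
  list(pairs = setNames(colnames(d)[best], rownames(d)),
       total = min(totals))
}

res <- best_pairing(d)
res$pairs  # control chosen for each treated unit
res$total  # sum of discrepancies for that pairing
```

Enumeration grows far too quickly to be usable beyond toy sizes, which is why a dedicated optimizer is needed in practice.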

### Usage

`pairmatch(x, controls, data, remove.unmatchables, ...)`

### Arguments

| Argument | Description |
| --- | --- |
| `x` | Any valid input to `match_on`. Alternatively, a precomputed distance may be entered. |
| `controls` | The number of controls to be matched to each treatment unit. |
| `data` | Optional data set. |
| `remove.unmatchables` | Should treatment group members for which there are no eligible controls be removed prior to matching? |
| `...` | Additional arguments to pass on; see the `fullmatch` documentation noted under Details. |

### Details

This is a wrapper to `fullmatch`; see its documentation for more
information, especially on additional arguments to pass, additional discussion
of valid input for parameter `x`, and feasibility recovery.

If `remove.unmatchables` is `FALSE` and there are unmatchable treated units,
the matching as a whole will fail and no units will be matched. If `TRUE`,
the unmatchable units will be removed and the function will attempt to match
each of the other treatment units. (In this case matching can still fail if
there is too much competition for certain controls; if you find yourself in
that situation you should consider full matching, which necessarily finds a
match for everyone with an eligible match somewhere.)
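
As a rough illustration of the two failure modes described above, the base-R sketch below flags treated units with no eligible control and then checks one simple symptom of competition. It assumes that an infinite entry in a distance matrix marks a forbidden pairing; the matrix `d` and both checks are illustrative and are not taken from the package's internals.

```r
# Invented 3 x 3 distance matrix; Inf marks a pairing that is not allowed
# (an assumption made for this sketch).
d <- matrix(c(1,   Inf, Inf,
              2,   Inf, Inf,
              Inf, Inf, Inf),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("T1", "T2", "T3"), c("C1", "C2", "C3")))

eligible <- is.finite(d)

# Treated units with no eligible control at all ("unmatchable").
unmatchable <- rownames(d)[rowSums(eligible) == 0]
unmatchable

# After dropping them, pair matching needs at least as many usable
# controls as remaining treated units. This is a necessary condition
# only, not a full feasibility check.
remaining <- eligible[rowSums(eligible) > 0, , drop = FALSE]
usable_controls <- sum(colSums(remaining) > 0)
enough_controls <- usable_controls >= nrow(remaining)
enough_controls
```

Here `T3` is unmatchable, and even once it is set aside `T1` and `T2` compete for the single control `C1`, which is the kind of situation in which the text suggests considering full matching.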

The units of the `optmatch` object returned correspond to members of the
treatment and control groups in reference to which the matching problem was
posed, and are named accordingly; the names are taken from the row and column
names of `distance` (with possible additions from the optional `data`
argument). Each element of the vector is the concatenation of:
(i) a character abbreviation of `subclass.indices`, if that argument was
given, or the string '`m`' if it was not; (ii) the string `.`; and
(iii) a non-negative integer. Unmatched units have `NA` entries.

Secondarily, `fullmatch` returns various data about the matching process
and its result, stored as attributes of the named vector which is its primary
output. In particular, the `exceedances` attribute gives upper bounds,
not necessarily sharp, for the amount by which the sum of distances between
matched units in the result of `fullmatch` exceeds the least possible sum
of distances between matched units in a feasible solution to the matching
problem given to `fullmatch`. (Such a bound is also printed by
`print.optmatch` and by `summary.optmatch`.)
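
One way to look at these pieces on a fitted match is sketched below. It assumes the optmatch package is installed and reuses the `nuclearplants` data from the Examples section; the object name `pm` is arbitrary, and the accessors are ordinary base-R functions applied to the returned object.

```r
# Runs only when optmatch is available; otherwise the block is skipped.
if (requireNamespace("optmatch", quietly = TRUE)) {
  library(optmatch)
  data(nuclearplants)

  pm <- pairmatch(pr ~ t1 + t2, data = nuclearplants)

  print(head(names(pm)))          # unit names carried by the result
  print(levels(pm))               # labels of the matched sets
  print(sum(is.na(pm)))           # number of units left unmatched
  print(attr(pm, "exceedances"))  # the bound described in the text above
}
```

The exact form of the set labels may differ between package versions, so it is safer to inspect `levels(pm)` than to rely on a particular prefix.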

### Value

An `optmatch` object (`factor`) indicating matched groups.

### References

Hansen, B.B. and Klopfer, S.O. (2006), ‘Optimal full matching
and related designs via network flows’, *Journal of Computational
and Graphical Statistics*, **15**, 609–627.

### See Also

`matched`, `caliper`, `fullmatch`

### Examples

```
library(optmatch)
data(nuclearplants)

### Pair matching on a Mahalanobis distance
( pm1 <- pairmatch(pr ~ t1 + t2, data = nuclearplants) )
summary(pm1)

### Pair matching within a propensity score caliper.
ppty <- glm(pr ~ . - (pr + cost), family = binomial(), data = nuclearplants)
### For more complicated models, create a distance matrix and pass it to pairmatch.
mhd <- match_on(pr ~ t1 + t2, data = nuclearplants) + caliper(match_on(ppty), 2)
( pm2 <- pairmatch(mhd, data = nuclearplants) )
summary(pm2)

### Propensity balance assessment. Requires RItools package.
if (require(RItools)) summary(pm2, ppty)

### 1:2 matched triples
( tm <- pairmatch(pr ~ t1 + t2, controls = 2, data = nuclearplants) )
summary(tm)

### Creating a data frame with the matched sets attached.
### match_on(), caliper() and the like cooperate with pairmatch()
### to make sure observations are in the proper order:
all.equal(names(tm), row.names(nuclearplants))
### So our data frame including the matched sets is just
cbind(nuclearplants, matches = tm)

### In contrast, if your matching distance is an ordinary matrix
### (as earlier versions of optmatch required), you'll
### have to align it by observation name with your data set.
cbind(nuclearplants, matches = tm[row.names(nuclearplants)])

### Match in subgroups only. There are a few ways to specify this.
m1 <- pairmatch(pr ~ t1 + t2, data = nuclearplants,
                within = exactMatch(pr ~ pt, data = nuclearplants))
m2 <- pairmatch(pr ~ t1 + t2 + strata(pt), data = nuclearplants)

### Matching on propensity scores, within subgroups only:
m3 <- pairmatch(glm(pr ~ t1 + t2, data = nuclearplants, family = binomial),
                data = nuclearplants,
                within = exactMatch(pr ~ pt, data = nuclearplants))
m4 <- pairmatch(glm(pr ~ t1 + t2 + pt, data = nuclearplants,
                    family = binomial),
                data = nuclearplants,
                within = exactMatch(pr ~ pt, data = nuclearplants))
m5 <- pairmatch(glm(pr ~ t1 + t2 + strata(pt), data = nuclearplants,
                    family = binomial), data = nuclearplants)
# Including `strata(foo)` inside a glm uses `foo` in the model as
# well, so here m4 and m5 are equivalent. m3 differs in that it does
# not include `pt` in the glm.
```