fullmatch | R Documentation |
Given two groups, such as a treatment and a control group, and a method of creating a treatment-by-control discrepancy matrix indicating desirability and permissibility of potential matches (or optionally an already created such discrepancy matrix), create optimal full matches of members of the groups. Optionally, incorporate restrictions on matched sets' ratios of treatment to control units.
fullmatch(
x,
min.controls = 0,
max.controls = Inf,
omit.fraction = NULL,
mean.controls = NULL,
tol = 0.001,
data = NULL,
solver = "",
...
)
full(
x,
min.controls = 0,
max.controls = Inf,
omit.fraction = NULL,
mean.controls = NULL,
tol = 0.001,
data = NULL,
solver = "",
...
)
x |
Any valid input to match_on. fullmatch will use x and any optional arguments to generate a distance before performing the matching. Alternatively, a precomputed distance may be entered: a matrix of
non-negative discrepancies, each indicating the permissibility and
desirability of matching the unit corresponding to its row (a 'treatment') to
the unit corresponding to its column (a 'control'); or, better, a distance
specification as produced by match_on. |
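min.controls |
The minimum ratio of controls to treatments that is to
be permitted within a matched set: should be non-negative and finite. If
min.controls is not a whole number, the reciprocal of a whole number, or zero, then it is rounded down to the nearest whole number or reciprocal of a whole number. When matching within subclasses (such as those created by
exactMatch), min.controls may be a named numeric vector separately specifying the minimum permissible ratio of controls to treatments for each subclass. |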
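max.controls |
The maximum ratio of controls to treatments that is
to be permitted within a matched set: should be positive and numeric.
If max.controls is not a whole number, the reciprocal of a whole number, or Inf, then it is rounded up to the nearest whole number or reciprocal of a whole number. When matching within subclasses (such as those created by
exactMatch), max.controls may be a named numeric vector separately specifying the maximum permissible ratio of controls to treatments for each subclass. |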
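omit.fraction |
Optionally, specify what fraction of controls or treated
subjects are to be rejected. If omit.fraction is a positive fraction less than one, then fullmatch leaves up to that fraction of the control reservoir unmatched. If omit.fraction is a negative number greater than -1, then fullmatch leaves up to |omit.fraction| of the treated group unmatched. When matching within subclasses (such as those created by
exactMatch), omit.fraction specifies the fraction of controls to be rejected in each subproblem; it can be made to differ by subclass by supplying a named numeric vector of fractions.
At most one of mean.controls and omit.fraction can be non-NULL. |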
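mean.controls |
Optionally, specify the average number of controls per
treatment to be matched. Must be no less than min.controls and no greater than either max.controls or the ratio of the total number of controls to the total number of treated units. When matching within subclasses (such as those created by
exactMatch), mean.controls specifies the average number of controls per treatment in each subproblem; it can be made to differ by subclass by supplying a named numeric vector.
At most one of mean.controls and omit.fraction can be non-NULL. |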
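tol |
Because of internal rounding, fullmatch may solve a slightly different matching problem than the one specified, so the match it generates may not coincide with an optimal solution of the specified problem. tol times the number of subjects to be matched specifies the extent to which fullmatch's output is permitted to differ from an optimal solution to the original problem, as measured by the sum of discrepancies for all treatments and controls placed into the same matched sets. |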
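data |
Optional data.frame or vector used to determine the order of the final matching factor. Useful if you want to combine a match (using, e.g., cbind) with the data that were used to generate it. |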
solver |
Choose which solver to use. Currently implemented are RELAX-IV
and LEMON. The default of "" uses RELAX-IV if the rrelaxiv package is installed, and LEMON otherwise. To explicitly use RELAX-IV, pass the string "RELAX-IV". To use LEMON, pass the string "LEMON". Optionally, to specify which algorithm LEMON will use, pass a call to the function LEMON with the algorithm name as its argument: one of "CycleCancelling", "CapacityScaling", "CostScaling", or "NetworkSimplex". See this site for details on their differences: https://lemon.cs.elte.hu/pub/doc/latest/a00606.html. CycleCancelling is the default. The CycleCancelling algorithm seems to produce results most closely
resembling those of optmatch versions prior to 1.0. We have observed the
other LEMON algorithms to produce different results in some settings. |
... |
Additional arguments, passed to match_on (e.g. within). |
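As a hedged illustration of the solver argument described above, the sketch below requests LEMON by name and then with a specific algorithm. It assumes an optmatch version that provides the solver argument and the LEMON() helper mentioned in this page; the object names fm_lemon and fm_simplex are illustrative only.

```r
library(optmatch)
data(nuclearplants)

# Name the solver explicitly; LEMON then uses its default algorithm.
fm_lemon <- fullmatch(pr ~ t1 + t2, data = nuclearplants, solver = "LEMON")

# Select a particular LEMON algorithm by passing a call to LEMON().
fm_simplex <- fullmatch(pr ~ t1 + t2, data = nuclearplants,
                        solver = LEMON("NetworkSimplex"))

summary(fm_lemon)
summary(fm_simplex)
```

Different algorithms may return different, equally valid matches, so comparing the two summaries is a reasonable sanity check.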
If passing an already created discrepancy matrix, finite entries indicate permissible matches, with smaller discrepancies indicating more desirable matches. The matrix must have row and column names.
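A minimal sketch of passing a hand-built discrepancy matrix, using made-up toy values: rows are treated units, columns are controls, and Inf marks a forbidden pairing. The unit names and distances are hypothetical and chosen only for illustration.

```r
library(optmatch)

# Hypothetical toy problem: 2 treated units (rows) and 3 controls (columns).
# Inf entries forbid a pairing; smaller finite entries are preferred.
d <- matrix(c(1, 4,   Inf,
              3, Inf, 2),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("tA", "tB"), c("c1", "c2", "c3")))

# No data argument is given, so the order of the result is unspecified;
# look units up by name instead of by position.
fm_toy <- fullmatch(d)
sets <- setNames(as.character(fm_toy), names(fm_toy))
sets[c("tA", "tB", "c1", "c2", "c3")]
```

Here c2 can only be matched to tA and c3 only to tB, so each lands in the set of its sole permissible partner.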
If it is desirable to create the discrepancies matrix beforehand (for example, if planning on running several different matching schemes), consider using match_on to generate the distances. This generic function has several useful methods for handling propensity score models, computing Mahalanobis distances (and other arbitrary distances), and using user supplied functions. These distances can also be combined with those generated by exactMatch and caliper to create very nuanced matching specifications.
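The reuse pattern described above might look like the following sketch: the distance is computed once and then handed to fullmatch under two different sets of restrictions. The names mhd, fm_free and fm_bound, and the particular restriction values, are illustrative choices.

```r
library(optmatch)
data(nuclearplants)

# Compute the distance once ...
mhd <- match_on(pr ~ t1 + t2, data = nuclearplants)

# ... then reuse it for several matching schemes.
fm_free  <- fullmatch(mhd, data = nuclearplants)
fm_bound <- fullmatch(mhd, min.controls = 1, max.controls = 3,
                      data = nuclearplants)

summary(fm_free)
summary(fm_bound)
```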
The value of tol can have a substantial effect on computation time; with smaller values, computation takes longer. Not every tolerance can be met, and how small a tolerance is too small varies with the machine and with the details of the problem. If fullmatch can't guarantee that the tolerance is as small as the given value of argument tol, then matching proceeds but a warning is issued.
By default, fullmatch will attempt, if the given constraints are infeasible, to find a feasible problem using the same constraints. This will almost surely involve using a more restrictive omit.fraction or mean.controls. (This will never automatically omit treatment units.) Note that this does not guarantee that the returned match has the least possible number of omitted subjects; it only gives a match that is feasible within the given constraints. It may often be possible to loosen the omit.fraction or mean.controls constraint and still find a feasible match. The auto recovery is controlled by options("fullmatch_try_recovery").
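A sketch of the recovery behaviour described above, under the assumption that setting the fullmatch_try_recovery option to FALSE disables it: the same infeasible request is run with recovery on and then off. The object names are illustrative.

```r
library(optmatch)
data(nuclearplants)

# With 10 treated units and 22 controls, max.controls = 1 cannot place every
# control, so the request as stated is infeasible and recovery adjusts it.
fm_rec <- fullmatch(pr ~ t1 + t2, max.controls = 1, data = nuclearplants)
optmatch_restrictions(fm_rec)   # the restrictions that were actually used

# Switch recovery off, rerun, then restore the previous setting.
old <- options(fullmatch_try_recovery = FALSE)
fm_norec <- fullmatch(pr ~ t1 + t2, max.controls = 1, data = nuclearplants)
options(old)

# Count matched units in each result.
sum(!is.na(fm_rec))
sum(!is.na(fm_norec))
```

Without recovery the infeasible problem is expected to fail with a warning and leave units unmatched (NA).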
In full matching problems permitting many-one matches (min.controls less than 1), the number of controls contributing to matches can exceed what was requested by setting a value of mean.controls or omit.fraction. That is, in this setting mean.controls sets the minimum ratio of number of controls to number of treatments placed into matched sets.
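To see this in practice, one can compare the realized ratio of matched controls to matched treated units against the requested mean.controls. The sketch below uses illustrative values (min.controls = 1/2, mean.controls = 1.5) and relies on the data argument so that the match lines up with the rows of nuclearplants.

```r
library(optmatch)
data(nuclearplants)

fm_mc <- fullmatch(pr ~ t1 + t2, min.controls = 1/2, mean.controls = 1.5,
                   data = nuclearplants)

# Units placed into some matched set
in_match <- !is.na(fm_mc)
n_treated  <- sum(nuclearplants$pr[in_match] == 1)
n_controls <- sum(nuclearplants$pr[in_match] == 0)

# Realized controls-per-treated ratio; per the text above this should be
# at least the requested mean.controls when min.controls < 1.
n_controls / n_treated
```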
If the program detects (what it thinks is) a large problem, a warning is issued. Unless you have an older computer, there's a good chance that you can handle larger problems (at the cost of increased computation time). To check the large problem threshold, use getMaxProblemSize; to re-set it, use setMaxProblemSize.
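A brief sketch of checking and adjusting the threshold, restoring the original value afterwards. The doubling factor is an arbitrary illustrative choice, and raising the limit may itself produce a warning.

```r
library(optmatch)

# Inspect the current large-problem threshold
old_max <- getMaxProblemSize()
old_max

# Raise it for a larger problem, then put the original value back
setMaxProblemSize(old_max * 2)
getMaxProblemSize()
setMaxProblemSize(old_max)
```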
An optmatch object (a factor) indicating matched groups.
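Since the return value is a factor, it can be inspected with ordinary factor tools. A minimal sketch, assuming the data argument is supplied so that entries align with the rows of nuclearplants:

```r
library(optmatch)
data(nuclearplants)

fm_val <- fullmatch(pr ~ t1 + t2, data = nuclearplants)

is.factor(fm_val)        # the optmatch object is a factor
table(matched(fm_val))   # how many units were placed into a matched set

# Composition of each matched set: rows are sets, columns are pr (0/1)
table(fm_val, nuclearplants$pr)
```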
Hansen, B.B. and Klopfer, S.O. (2006), ‘Optimal full matching and related designs via network flows’, Journal of Computational and Graphical Statistics, 15, 609–627.
Hansen, B.B. (2004), ‘Full Matching in an Observational Study of Coaching for the SAT’, Journal of the American Statistical Association, 99, 609–618.
Rosenbaum, P. (1991), ‘A Characterization of Optimal Designs for Observational Studies’, Journal of the Royal Statistical Society, Series B, 53, 597–610.
data(nuclearplants)
### Full matching on a Mahalanobis distance.
( fm1 <- fullmatch(pr ~ t1 + t2, data = nuclearplants) )
summary(fm1)
### Full matching with restrictions.
( fm2 <- fullmatch(pr ~ t1 + t2, min.controls = .5, max.controls = 4, data = nuclearplants) )
summary(fm2)
### Full matching to half of available controls.
( fm3 <- fullmatch(pr ~ t1 + t2, omit.fraction = .5, data = nuclearplants) )
summary(fm3)
### Full matching attempts recovery when the initial restrictions are infeasible.
### Limiting max.controls = 1 allows use of only 10 of 22 controls.
( fm4 <- fullmatch(pr ~ t1 + t2, max.controls = 1, data=nuclearplants) )
summary(fm4)
### To recover restrictions
optmatch_restrictions(fm4)
### Full matching within a propensity score caliper.
ppty <- glm(pr ~ . - (pr + cost), family = binomial(), data = nuclearplants)
### Note that units without counterparts within the caliper are automatically dropped.
### For more complicated models, create a distance matrix and pass it to fullmatch.
mhd <- match_on(pr ~ t1 + t2, data = nuclearplants) + caliper(match_on(ppty), width = 1)
( fm5 <- fullmatch(mhd, data = nuclearplants) )
summary(fm5)
### Propensity balance assessment. Requires RItools package.
if (require(RItools)) summary(fm5,ppty)
### The order of the names in the match factor is the same
### as the nuclearplants data.frame since we used the data argument
### when calling fullmatch. The order would be unspecified otherwise.
cbind(nuclearplants, matches = fm5)
### Match in subgroups only. There are a few ways to specify this.
m1 <- fullmatch(pr ~ t1 + t2, data=nuclearplants,
                within=exactMatch(pr ~ pt, data=nuclearplants))
m2 <- fullmatch(pr ~ t1 + t2 + strata(pt), data=nuclearplants)
### Matching on propensity scores within matching in subgroups only:
m3 <- fullmatch(glm(pr ~ t1 + t2, data=nuclearplants, family=binomial),
                data=nuclearplants,
                within=exactMatch(pr ~ pt, data=nuclearplants))
m4 <- fullmatch(glm(pr ~ t1 + t2 + pt, data=nuclearplants,
                    family=binomial),
                data=nuclearplants,
                within=exactMatch(pr ~ pt, data=nuclearplants))
m5 <- fullmatch(glm(pr ~ t1 + t2 + strata(pt), data=nuclearplants,
                    family=binomial), data=nuclearplants)
# Including `strata(foo)` inside a glm uses `foo` in the model as
# well, so here m4 and m5 are equivalent. m3 differs in that it does
# not include `pt` in the glm.