Matchby  R Documentation 
This function is a wrapper for the Match function which separates the matching problem into subgroups defined by a factor. This is equivalent to conducting exact matching on each level of a factor. Matches within each level are found as determined by the usual matching options. This function is much faster for large datasets than the Match function itself. For additional speed, consider doing matching without replacement; see the replace option. This function is more limited than the Match function. For example, Matchby cannot be used if the user wishes to provide observation-specific weights.
Matchby(Y, Tr, X, by, estimand = "ATT", M = 1, ties = FALSE, replace = TRUE,
        exact = NULL, caliper = NULL, AI = FALSE, Var.calc = 0, Weight = 1,
        Weight.matrix = NULL, distance.tolerance = 1e-05,
        tolerance = sqrt(.Machine$double.eps), print.level = 1,
        version = "Matchby", ...)
Y 
A vector containing the outcome of interest. Missing values are not allowed. 
Tr 
A vector indicating the observations which are in the treatment regime and those which are not. This can either be a logical vector or a real vector where 0 denotes control and 1 denotes treatment. 
X 
A matrix containing the variables we wish to match on. This matrix may contain the actual observed covariates or the propensity score or a combination of both. 
by 
A "factor" in the sense that 
estimand 
A character string for the estimand. The default estimand is "ATT", the sample average treatment effect for the treated. "ATE" is the sample average treatment effect (for all), and "ATC" is the sample average treatment effect for the controls. 
M 
A scalar for the number of matches which should be
found. The default is one-to-one matching. Also see the

ties 
A logical flag for whether ties should be handled
deterministically. By default 
replace 
Whether matching should be done with replacement. Note
that if 
exact 
A logical scalar or vector for whether exact matching
should be done. If a logical scalar is provided, that logical value is
applied to all covariates of

caliper 
A scalar or vector denoting the caliper(s) which
should be used when matching. A caliper is the distance which is
acceptable for any match. Observations which are outside of the
caliper are dropped. If a scalar caliper is provided, this caliper is
used for all covariates in 
AI 
A logical flag for whether the Abadie-Imbens standard error
should be calculated. It is computationally expensive to calculate
with large datasets. 
Var.calc 
A scalar for the variance estimate
that should be used. By default 
Weight 
A scalar for the type of
weighting scheme the matching algorithm should use when weighting
each of the covariates in 
Weight.matrix 
This matrix denotes the weights the matching
algorithm uses when weighting each of the covariates in X. For most uses, this matrix has zeros in the off-diagonal
cells. This matrix can be used to weight some variables more than
others. For
example, if 
distance.tolerance 
This is a scalar which is used to determine if distances
between two observations are different from zero. Values less than

tolerance 
This is a scalar which is used to determine numerical tolerances. This option is used by numerical routines such as those used to determine if a matrix is singular. 
print.level 
The level of printing. Set to '0' to turn off printing. 
version 
The version of the code to be used. The "Matchby" C/C++ version of the code is the fastest, and the end user should not change this option. 
... 
Additional arguments passed on to 
Matchby is much faster for large datasets than Match. But Matchby only implements a subset of the functionality of Match. For example, the restrict option cannot be used, Abadie-Imbens standard errors are not provided and bias adjustment cannot be requested.
Matchby is a wrapper for the Match function which separates the matching problem into subgroups defined by a factor. This is equivalent to doing exact matching on each level of the factor, and the way in which matches are found within each level is determined by the usual matching options.
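The idea of matching separately within each level of a factor can be sketched in base R. This is a conceptual illustration only, not the package's implementation: `match_within_groups` is a hypothetical helper that does nearest-neighbour matching with replacement on a single covariate, and it resolves ties by taking the first closest control.

```r
# Conceptual sketch only: nearest-neighbour matching with replacement on a
# single covariate, carried out separately within each level of a factor.
# `match_within_groups` is a hypothetical helper, not part of any package.
match_within_groups <- function(tr, x, by) {
  by <- as.factor(by)
  pairs <- lapply(levels(by), function(g) {
    treated  <- which(by == g & tr == 1)
    controls <- which(by == g & tr == 0)
    # A level with no treated or no control units yields no matches.
    if (length(treated) == 0 || length(controls) == 0) return(NULL)
    # For each treated unit, pick the closest control in the same level
    # (which.min returns the first minimum, so ties go to the first control).
    nearest <- vapply(treated, function(i) {
      controls[which.min(abs(x[controls] - x[i]))]
    }, integer(1))
    data.frame(index.treated = treated, index.control = nearest, group = g)
  })
  do.call(rbind, pairs)
}

# Simulated data: two levels, 5 treated and 15 control units in each.
set.seed(1)
n  <- 40
by <- rep(c("a", "b"), each = n / 2)
tr <- rep(c(1, 0, 0, 0), times = n / 4)
x  <- rnorm(n)

matched <- match_within_groups(tr, x, by)

# Every matched pair shares the same level of `by`, which is what exact
# matching on the factor means.
all(by[matched$index.treated] == by[matched$index.control])
```

Because each level is searched on its own, a treated unit is only ever compared with the controls in its own level, which is where the speed-up on large datasets comes from.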
Note that by default ties=FALSE although the default for the Match and GenMatch functions is TRUE. This is done because randomly breaking ties in large datasets often results in a great speedup. For additional speed, consider doing matching without replacement, which is often much faster when the dataset is large; see the replace option.
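A minimal sketch of switching between matching with and without replacement is given below. It assumes the Matching package is installed (the block is skipped otherwise) and uses a deliberately small propensity score model chosen only for illustration. The object names `ps`, `fit_with` and `fit_without` are hypothetical.

```r
# Assumes the Matching package is installed; the block is skipped otherwise.
have_matching <- requireNamespace("Matching", quietly = TRUE)

if (have_matching) {
  data(lalonde, package = "Matching")

  # Propensity score from a deliberately small model (illustration only).
  ps <- glm(treat ~ age + educ + re74 + re75,
            family = binomial, data = lalonde)$fitted.values

  # Ties are broken at random when ties = FALSE, so fix the seed.
  set.seed(42)

  # Defaults: matching with replacement (replace = TRUE), ties = FALSE.
  fit_with <- Matching::Matchby(Y = lalonde$re78, Tr = lalonde$treat, X = ps,
                                by = lalonde$black, print.level = 0)

  # Matching without replacement, suggested in the text for extra speed.
  fit_without <- Matching::Matchby(Y = lalonde$re78, Tr = lalonde$treat,
                                   X = ps, by = lalonde$black,
                                   replace = FALSE, print.level = 0)

  # Compare the two point estimates side by side.
  print(c(with_replacement = fit_with$est,
          without_replacement = fit_without$est))
}
```

On a dataset this small the difference in run time is negligible; the comparison is only meant to show which argument to change.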
There will be slight differences in the matches produced by Matchby and Match because of how the covariates are weighted. When the data is broken up into separate groups (via the by option), Mahalanobis distance and inverse variance will imply different weights than when the data is taken as a whole.
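The effect on inverse variance weights can be seen with simulated data in base R. This is a hedged illustration under assumed data, not output from the package: the covariates `x1` and `x2`, the grouping factor and the helper `inv_var` are all hypothetical.

```r
# Simulated covariates whose spread differs between two groups.
set.seed(2)
n  <- 200
by <- factor(rep(c("a", "b"), each = n / 2))
X  <- cbind(
  x1 = rnorm(n, mean = ifelse(by == "a", 0, 5), sd = 1),  # group means differ
  x2 = rnorm(n, sd = ifelse(by == "a", 1, 3))             # group spreads differ
)

# Hypothetical helper: inverse of each column's sample variance.
inv_var <- function(m) 1 / apply(m, 2, var)

# Weights implied by the whole dataset versus by each group separately.
pooled    <- inv_var(X)
per_group <- t(sapply(split(as.data.frame(X), by), inv_var))

rbind(pooled = pooled, per_group)
```

The pooled variance of `x1` includes the between-group difference in means, so its pooled weight is much smaller than the weight computed within either group. A distance computed group by group therefore scales the covariates differently from one computed on the full data.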
est 
The estimated average causal effect. 
se.standard 
The usual standard error. This is the standard error calculated on the matched data using the usual method of calculating the difference of means (between treated and control) weighted so that ties are taken into account. 
se 
The Abadie-Imbens standard error. This is only calculated
if the 
index.treated 
A vector containing the observation numbers from
the original dataset for the treated observations in the
matched dataset. This index in conjunction with 
index.control 
A vector containing the observation numbers from
the original data for the control observations in the
matched data. This index in conjunction with 
weights 
The weights for each observation in the matched dataset. 
orig.nobs 
The original number of observations in the dataset. 
nobs 
The number of observations in the matched dataset. 
wnobs 
The number of weighted observations in the matched dataset. 
orig.treated.nobs 
The original number of treated observations. 
ndrops 
The number of matches which were dropped because there were not enough observations in a given group and because of caliper and exact matching. 
estimand 
The estimand which was estimated. 
version 
The version of 
Jasjeet S. Sekhon, UC Berkeley, sekhon@berkeley.edu, http://sekhon.berkeley.edu/.
Sekhon, Jasjeet S. 2011. "Multivariate and Propensity Score Matching Software with Automated Balance Optimization." Journal of Statistical Software 42(7): 1–52. doi: 10.18637/jss.v042.i07
Diamond, Alexis and Jasjeet S. Sekhon. 2013. "Genetic Matching for Estimating Causal Effects: A General Multivariate Matching Method for Achieving Balance in Observational Studies." Review of Economics and Statistics 95(3): 932–945. http://sekhon.berkeley.edu/papers/GenMatch.pdf
Abadie, Alberto and Guido Imbens. 2006. "Large Sample Properties of Matching Estimators for Average Treatment Effects." Econometrica 74(1): 235–267.
Imbens, Guido. 2004. Matching Software for Matlab and Stata.
Also see Match, summary.Matchby, GenMatch, MatchBalance, balanceUV, qqstats, ks.boot, GerberGreenImai, lalonde
#
# Match exactly by racial groups and then match using the propensity score
# within racial groups
#

# Load the package that provides Matchby, MatchBalance and the lalonde data
library(Matching)
data(lalonde)

#
# Estimate the Propensity Score
#
glm1 <- glm(treat~age + I(age^2) + educ + I(educ^2) + hisp + married +
            nodegr + re74 + I(re74^2) + re75 + I(re75^2) + u74 + u75,
            family=binomial, data=lalonde)

#
# save data objects
#
X  <- glm1$fitted
Y  <- lalonde$re78
Tr <- lalonde$treat

# one-to-one matching with replacement (the "M=1" option) after exactly
# matching on race using the 'by' option. Estimating the treatment
# effect on the treated (the "estimand" option defaults to ATT).
rr <- Matchby(Y=Y, Tr=Tr, X=X, by=lalonde$black, M=1)
summary(rr)

# Let's check the covariate balance
# 'nboots' is set to small values in the interest of speed.
# Please increase to at least 500 each for publication quality p-values.
mb <- MatchBalance(treat~age + I(age^2) + educ + I(educ^2) + black + hisp +
                   married + nodegr + re74 + I(re74^2) + re75 + I(re75^2) +
                   u74 + u75, data=lalonde, match.out=rr, nboots=10)