distmat | R Documentation |
Function for building a normalized rank-based Mahalanobis distance matrix with a penalty for caliper violation.
distmat(t_ind, X_mat, calip_cov = NULL, calip_size = NULL, calip_penalty = NULL,
near_exact_covs = NULL, near_exact_penalties = NULL, digits = 1)
t_ind |
treatment indicator: a vector of zeros and ones indicating treatment (1 = treated; 0 = control). |
X_mat |
matrix of covariates: a matrix of covariates used to build the rank-based Mahalanobis distance matrix. |
calip_cov |
caliper covariate: a covariate vector used to define the caliper. In most applications this is the propensity score, but any other covariate can be used as well. |
calip_size |
caliper size: a scalar that determines the size of the caliper for which there will be no penalty. Most applications use |
calip_penalty |
a scalar used to multiply the magnitude of the violation of the caliper. |
near_exact_covs |
a matrix of covariates used for near-exact matching. |
near_exact_penalties |
a vector of scalars used for near-exact matching. The length of |
digits |
a scalar indicating the number of digits used to produce each entry of the distance matrix. The default is 1 digit. |
distmat is a function for building a normalized rank-based Mahalanobis distance matrix with a penalty for caliper violations on a covariate (say, the propensity score) and penalties for near-exact matching.
As explained in Rosenbaum (2010), using a rank-based Mahalanobis distance prevents an outlier from inflating the variance of a variable and thereby decreasing that variable's importance in the matching. In calculating the matrix, the variances are constrained not to decrease as ties become more common, so that matching on a rare binary variable is not treated as more important than matching on a common one. The penalty for caliper violations promotes good balance on the propensity score or on whichever covariate defines the caliper. In this way, the rank-based Mahalanobis distance with a penalty for caliper violations on the propensity score constitutes a robust distance for matching.
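The ideas in this paragraph can be sketched in base R. This is only an illustration under stated assumptions, not the code used by distmat: the function names rank_mahal_sketch and add_caliper_sketch are hypothetical, the rank covariance matrix is assumed to be nonsingular, no covariate is assumed to be constant, and the normalization and rounding that distmat applies are left out.

```r
# Illustrative sketch only; these helper names are hypothetical and are
# not part of the designmatch package.
rank_mahal_sketch <- function(t_ind, X_mat) {
  X_mat <- as.matrix(X_mat)
  n <- nrow(X_mat)
  # Replace each covariate by its ranks (ties receive average ranks),
  # so a single outlier cannot inflate a variable's variance
  R <- apply(X_mat, 2, rank)
  cv <- cov(R)
  # Rescale so every variable has the variance of untied ranks 1..n;
  # without this, heavily tied variables would have smaller variances
  # and therefore larger weight in the distance
  v_untied <- var(seq_len(n))
  scale_by <- sqrt(v_untied / diag(cv))
  cv <- cv * outer(scale_by, scale_by)
  icov <- solve(cv)  # assumes the rescaled covariance is nonsingular
  R_t <- R[t_ind == 1, , drop = FALSE]
  R_c <- R[t_ind == 0, , drop = FALSE]
  # Rows index treated units, columns index controls
  out <- matrix(NA_real_, nrow(R_t), nrow(R_c))
  for (i in seq_len(nrow(R_t))) {
    # mahalanobis() returns squared distances
    out[i, ] <- mahalanobis(R_c, R_t[i, ], icov, inverted = TRUE)
  }
  out
}

add_caliper_sketch <- function(dist_mat, t_ind, calip_cov, calip_size, calip_penalty) {
  # Absolute treated-control differences on the caliper covariate
  gap <- abs(outer(calip_cov[t_ind == 1], calip_cov[t_ind == 0], "-"))
  # Only the part of the difference that exceeds the caliper is penalized
  violation <- pmax(gap - calip_size, 0)
  dist_mat + calip_penalty * violation
}

# Toy data: 2 treated units followed by 4 controls
t_ind <- c(1, 1, 0, 0, 0, 0)
X_mat <- cbind(x1 = c(3, 4, 1, 6, 5, 2), x2 = c(0.5, 0.7, 0.6, -0.3, 1.5, 0.4))
d <- rank_mahal_sketch(t_ind, X_mat)
d_pen <- add_caliper_sketch(d, t_ind, calip_cov = X_mat[, "x1"],
                            calip_size = 0.5, calip_penalty = 10)
```

Pairs whose difference on the caliper covariate stays within calip_size keep their original distance; all other pairs become more expensive in proportion to the size of the violation.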
As explained in Zubizarreta et al. (2011), the distance matrix can also be modified for near-exact matching. A penalty is added to the corresponding entry of the distance matrix whenever a treated unit and a control unit have different values for a near-exact matching covariate.
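A minimal base R sketch of this penalty scheme follows. It is an illustration rather than the package's implementation: add_near_exact_sketch is a hypothetical name, and the sketch assumes one penalty per column of near_exact_covs.

```r
# Illustrative sketch only; add_near_exact_sketch() is a hypothetical helper,
# not a designmatch function.
add_near_exact_sketch <- function(dist_mat, t_ind, near_exact_covs, near_exact_penalties) {
  near_exact_covs <- as.matrix(near_exact_covs)
  # Assumption of this sketch: one penalty per near-exact covariate
  stopifnot(ncol(near_exact_covs) == length(near_exact_penalties))
  for (j in seq_len(ncol(near_exact_covs))) {
    v <- near_exact_covs[, j]
    # TRUE where a treated unit (row) and a control (column) differ on covariate j
    mismatch <- outer(v[t_ind == 1], v[t_ind == 0], "!=")
    dist_mat <- dist_mat + near_exact_penalties[j] * mismatch
  }
  dist_mat
}

# Toy data: 2 treated units followed by 3 controls, starting from zero distances
t_ind <- c(1, 1, 0, 0, 0)
dist_mat <- matrix(0, 2, 3)
covs <- cbind(sex = c(1, 0, 1, 0, 0), region = c(2, 2, 2, 3, 2))
pen <- add_near_exact_sketch(dist_mat, t_ind, covs, c(10, 5))
pen
# Row 1 (treated unit 1): 0 15 10
# Row 2 (treated unit 2): 10 5 0
```

Because the penalties are finite, a mismatch is discouraged rather than forbidden, which is what distinguishes near-exact from exact matching.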
A matrix that can be used for optimal matching with the bmatch functions in the designmatch package.
Rosenbaum, P. R. (2010), Design of Observational Studies, Springer.
Zubizarreta, J. R., Reinke, C. E., Kelz, R. R., Silber, J. H., and Rosenbaum, P. R. (2011), "Matching for Several Sparse Nominal Variables in a Case-Control Study of Readmission Following Surgery," The American Statistician, 65, 229-238.
# Load the package and the data
library(designmatch)
data(germancities)
attach(germancities)
# Treatment indicator
t_ind = treat
# Matrix of covariates
X_mat = cbind(log2pop, popgrowth1939, popgrowth3339, emprate, indrate, rubble,
rubblemiss, flats, flatsmiss, refugees)
# Distance matrix
dist_mat = distmat(t_ind, X_mat)
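The optional arguments in the usage above can be combined in one call. The following is a hedged sketch rather than a recommended specification: using log2pop as the caliper covariate, the caliper size, the penalty values, and the pair of near-exact covariates are all illustrative assumptions, not choices taken from this page.

```r
library(designmatch)
data(germancities)

# Treatment indicator and covariate matrix, as in the example above
t_ind <- germancities$treat
X_mat <- with(germancities, cbind(log2pop, popgrowth1939, popgrowth3339, emprate,
                                  indrate, rubble, rubblemiss, flats, flatsmiss,
                                  refugees))

# Caliper on a single covariate (a propensity score could be used instead)
calip_cov <- germancities$log2pop
calip_size <- 0.2 * sd(calip_cov)  # illustrative assumption
calip_penalty <- 100               # illustrative assumption

# Near-exact matching on two binary indicators, one penalty per column
near_exact_covs <- with(germancities, cbind(rubblemiss, flatsmiss))
near_exact_penalties <- c(50, 50)  # illustrative assumption

dist_mat <- distmat(t_ind, X_mat,
                    calip_cov = calip_cov,
                    calip_size = calip_size,
                    calip_penalty = calip_penalty,
                    near_exact_covs = near_exact_covs,
                    near_exact_penalties = near_exact_penalties,
                    digits = 1)

# One row per treated unit and one column per control
dim(dist_mat)
```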