matchMulti: A function that performs multilevel matching.

matchMulti    R Documentation

Description

This is the workhorse function of the package: it matches groups and units within groups. For example, it will match both schools and students within schools, with the goal of making treated and control units more comparable when estimating treatment effects.

Usage

matchMulti(
  data,
  treatment,
  school.id,
  match.students = TRUE,
  student.vars = NULL,
  school.caliper = NULL,
  school.fb = NULL,
  verbose = FALSE,
  keep.target = NULL,
  student.penalty.qtile = 0.05,
  min.keep.pctg = 0.8,
  school.penalty = NULL,
  save.first.stage = TRUE,
  tol = 10,
  solver = "rlemon"
)

Arguments

data

A data frame for use in matching.

treatment

Name of covariate that defines treated and control groups.

school.id

Name of the variable that identifies groups (for example, schools).

match.students

Logical flag for whether units within groups should also be matched. If set to FALSE, all units will be retained in both groups.

student.vars

Names of student-level covariates on which to measure balance. School-level distances will be penalized when student matches are imbalanced on these variables. In addition, when match.students is TRUE, students are matched on a distance computed from these covariates.

school.caliper

A matrix with one row for each treated school and one column for each control school, containing zeroes for pairings allowed by the caliper and Inf values for forbidden pairings. When NULL, no caliper is imposed.

school.fb

A list of discrete group-level covariates on which to enforce fine balance, i.e., to ensure that marginal distributions are balanced. The first group is the most important, the second is the next most important, and so on. If a simple list of variable names is given, a single group is assumed. A list of lists specifies this hierarchy.

verbose

Logical flag for whether to give detailed output.

keep.target

An optional numeric value specifying the number of treated schools desired in the final match.

student.penalty.qtile

Quantile used to exclude students who are difficult to match. The default is 0.05, which implies that the match prefers to exclude students rather than match them at distances larger than this quantile of the overall student-student robust Mahalanobis distance distribution.

min.keep.pctg

Minimum percentage of students (from the smaller school) to keep when matching students in each school pair. The default value of 0.8 suggests it is given as a proportion.

school.penalty

A penalty to remove groups (schools) in the group (school) match.

save.first.stage

Logical flag for whether first-stage matches should be saved.

tol

A numeric tolerance value for comparing distances, used in the school match. It may need to be raised above the default when matching with many levels of refined balance or in very large problems (when these distances will often be at least on the order of the tens of thousands).

solver

Name of the package used to solve the underlying network flow problem for the school match, either 'rlemon' or 'rrelaxiv'. rrelaxiv carries an academic license and is not hosted on CRAN, so it must be installed separately.
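
As an illustration of the school.caliper format described above, the following sketch builds such a matrix in base R. The school-level data frame, its columns, and the caliper width of 100 are made up for this example. How rows and columns must be ordered relative to the data passed to matchMulti is not covered here and should be checked against the package.

```r
# Hypothetical school-level data: one row per school (not from the package)
schools <- data.frame(
  school = c("A", "B", "C", "D", "E"),
  sector = c(1, 1, 0, 0, 0),            # 1 = treated, 0 = control
  size   = c(400, 900, 450, 1000, 700),
  stringsAsFactors = FALSE
)

treated <- schools[schools$sector == 1, ]
control <- schools[schools$sector == 0, ]

# Absolute difference in size for every treated-control pair
# (rows = treated schools, columns = control schools)
size.diff <- abs(outer(treated$size, control$size, "-"))

# 0 where the pairing is allowed, Inf where it is forbidden
caliper.width <- 100                     # assumed value for illustration
school.caliper <- ifelse(size.diff <= caliper.width, 0, Inf)
dimnames(school.caliper) <- list(treated$school, control$school)

school.caliper
```

A matrix built this way has one row per treated school and one column per control school, which is the shape the argument description asks for.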

Details

matchMulti first matches students (or other individual units) within each pairwise combination of schools (or other groups); based on these matches a distance matrix is generated for the schools. Then schools are matched on this distance matrix and the student matches for the selected school pairs are combined into a single matched sample.
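
The two-stage logic described above can be mimicked on a toy problem. The sketch below is not the package's algorithm: it replaces the student match with a simple difference in school means, and it finds the school match by brute-force enumeration rather than by network flow. All data and object names (students, pair.dist, candidates) are hypothetical.

```r
# Hypothetical student-level data: two treated and three control schools
students <- data.frame(
  school = rep(c("T1", "T2", "C1", "C2", "C3"), each = 4),
  treat  = rep(c(1, 1, 0, 0, 0), each = 4),
  ses    = rep(c(0.5, -0.5, 0.4, -0.6, 2), each = 4) +
           rep(c(-0.1, 0, 0, 0.1), times = 5),
  stringsAsFactors = FALSE
)
treated <- c("T1", "T2")
control <- c("C1", "C2", "C3")

# Stage 1 (stand-in): a distance for every treated-control school pair,
# here the absolute difference in mean ses instead of a real student match
school.means <- vapply(split(students$ses, students$school), mean, numeric(1))
pair.dist <- abs(outer(school.means[treated], school.means[control], "-"))
dimnames(pair.dist) <- list(treated, control)

# Stage 2 (stand-in): give each treated school a distinct control school
# so that the total distance is as small as possible, by enumeration
candidates <- expand.grid(T1 = seq_along(control), T2 = seq_along(control))
candidates <- candidates[candidates$T1 != candidates$T2, ]
total <- pair.dist["T1", candidates$T1] + pair.dist["T2", candidates$T2]
best <- candidates[which.min(total), ]
school.pairs <- data.frame(treated = treated,
                           control = control[unlist(best)],
                           stringsAsFactors = FALSE)

# Stage 3: keep only the students from the selected school pairs
keep <- c(school.pairs$treated, school.pairs$control)
matched <- students[students$school %in% keep, ]

school.pairs
```

Enumeration is only workable for a handful of schools, so it serves here purely to make the order of the steps concrete.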

School covariates are not used to compute the distance matrix for schools (since it is generated from the student match). Instead, imbalances in school covariates should be addressed through the school.fb argument, which encodes a refined covariate balance constraint. School covariates in school.fb should be given in order of priority for balance, since the matching algorithm optimally balances the variables in the first list element, then attempts to further balance those in the second element, and so on.
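
To make the priority ordering concrete, the sketch below writes a two-level school.fb list in the same form as the package example and tabulates the marginal distribution of each covariate by treatment group, which is the quantity a fine balance constraint targets. The school-level data frame is invented for illustration, and the tabulation is only a pre-match diagnostic, not part of matchMulti.

```r
# Hypothetical school-level data (1 = treated, 0 = control)
schools <- data.frame(
  sector              = c(1, 1, 1, 0, 0, 0, 0, 0),
  size_large          = c(1, 1, 0, 1, 0, 0, 0, 1),
  minority_mean_large = c(0, 1, 1, 0, 0, 1, 0, 1)
)

# Priority follows list position: size_large first, then minority_mean_large
school.fb <- list(c("size_large"), c("minority_mean_large"))

# Marginal distribution of each covariate within each treatment group,
# listed in priority order; rows are treatment groups and sum to 1
fb.vars <- unlist(school.fb)
fb.tables <- lapply(fb.vars, function(v) {
  prop.table(table(sector = schools$sector, value = schools[[v]]), margin = 1)
})
names(fb.tables) <- fb.vars

fb.tables
```

Large differences between the two rows of a table indicate a covariate whose marginal distribution differs between treated and control schools before matching.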

Value

raw

The original data before matching.

matched

The matched dataset of both units and groups. Outcome analysis and balance checks are performed on this item.

school.match

Object with two parts. The first lists which treated groups (schools) are matched to which control groups. The second lists the population of groups used in the match.

school.id

Name of school identifier

treatment

Name of treatment variable

Author(s)

Luke Keele, Penn State University, ljk20@psu.edu

Sam Pimentel, University of Pennsylvania, spi@wharton.upenn.edu

See Also

matchMultisens, balanceMulti, matchMultioutcome, rematchSchools

Examples



# Toy example with short runtime
library(matchMulti)

# Load Catholic school data
data(catholic_schools)

# Trim data to speed up example
catholic_schools <- catholic_schools[catholic_schools$female_mean > .45 &
                                       catholic_schools$female_mean < .60, ]

# Match on a single covariate
student.cov <- c('minority')

match.simple <- matchMulti(catholic_schools, treatment = 'sector',
                           school.id = 'school', match.students = FALSE,
                           student.vars = student.cov, verbose = TRUE, tol = .01)

# Check balance after matching - this checks both student and school balance
balanceMulti(match.simple, student.cov = student.cov)


## Not run: 
# Larger example
data(catholic_schools)

student.cov <- c('minority', 'female', 'ses')

# Check student balance before matching
balanceTable(catholic_schools[c(student.cov, 'sector')], treatment = 'sector')

# Match schools but not students within schools
match.simple <- matchMulti(catholic_schools, treatment = 'sector',
                           school.id = 'school', match.students = FALSE)

# Check balance after matching - this checks both student and school balance
balanceMulti(match.simple, student.cov = student.cov)

# Estimate treatment effect
output <- matchMultioutcome(match.simple, out.name = "mathach",
                            schl_id_name = "school", treat.name = "sector")

# Perform sensitivity analysis using Rosenbaum bound -- increase Gamma to
# increase effect of possible hidden confounder
matchMultisens(match.simple, out.name = "mathach",
               schl_id_name = "school",
               treat.name = "sector", Gamma = 1.3)


# Now match both schools and students within schools
match.out <- matchMulti(catholic_schools, treatment = 'sector',
                        school.id = 'school', match.students = TRUE,
                        student.vars = student.cov)

# Check balance again
bal.tab <- balanceMulti(match.out, student.cov = student.cov)

# Now match with fine balance constraints on whether the school is large
# or has a high percentage of minority students
match.fb <- matchMulti(catholic_schools, treatment = 'sector', school.id = 'school',
                       match.students = TRUE, student.vars = student.cov,
                       school.fb = list(c('size_large'), c('minority_mean_large')))

# Estimate treatment effects
matchMultioutcome(match.fb, out.name = "mathach", schl_id_name = "school",
                  treat.name = "sector")

# Check balance
balanceMulti(match.fb, student.cov = student.cov)


## End(Not run)



matchMulti documentation built on May 31, 2023, 9:13 p.m.