matchSchools: Match Schools on Student-based Distance


Takes in a school distance matrix created using information from the first-stage student match and matches schools optimally, potentially with refined covariate balance constraints.


matchSchools(dmat, students, treatment, school.id, school.fb, penalty, verbose, tol)



dmat: a distance matrix for schools, with a row for each treated school and a column for each control school.


students: a dataframe containing student and school covariates, with a different row for each student.


treatment: the column name of the binary treatment status indicator in the students dataframe.

school.id: the column name of the unique school ID in the students dataframe.


school.fb: an optional list of character vectors, each containing a subset of the column names of students. Each element of the list should contain all the names in previous elements (producing a nested structure).


penalty: a numeric value, treated as the cost to the objective function of excluding a treated school. If it is set lower, more schools will be excluded.


verbose: a logical value indicating whether detailed output should be printed.


tol: a numeric tolerance value for comparing distances. It may need to be raised above the default when matching with many levels of refined balance.
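The effect of the penalty argument can be illustrated with a small brute-force sketch in base R. This is not the algorithm the package uses; toy_match is a hypothetical helper and the distances are made up. It only shows how a per-school exclusion cost trades off against match distances in an objective of the kind described above.

```r
# Made-up distance matrix: rows are treated schools, columns are control schools.
dmat <- matrix(c(0.2, 1.5,
                 1.1, 3.0),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("T1", "T2"), c("C1", "C2")))

# Hypothetical helper: enumerate every assignment and keep the cheapest one.
# Each treated school gets a control column index, or 0 meaning "excluded".
toy_match <- function(dmat, penalty) {
  n.t <- nrow(dmat)
  n.c <- ncol(dmat)
  grid <- as.matrix(expand.grid(rep(list(0:n.c), n.t)))
  best <- NULL
  best.cost <- Inf
  for (i in seq_len(nrow(grid))) {
    a <- grid[i, ]
    used <- a[a > 0]
    if (anyDuplicated(used)) next  # each control school is used at most once
    # Total cost: matched distances plus the penalty for each excluded school.
    cost <- sum(dmat[cbind(which(a > 0), used)]) + penalty * sum(a == 0)
    if (cost < best.cost) {
      best.cost <- cost
      best <- a
    }
  }
  best
}

high <- toy_match(dmat, penalty = 10)  # exclusion is expensive
low  <- toy_match(dmat, penalty = 1)   # exclusion is cheap
sum(high == 0)  # number of treated schools excluded under the high penalty
sum(low == 0)   # number of treated schools excluded under the low penalty
```

With the high penalty every treated school is matched; with the low penalty it becomes cheaper to drop the treated school whose available matches are poor.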


The school.fb argument encodes a refined covariate balance constraint: the matching algorithm optimally balances the interaction of the variables in the first list element, then attempts to further balance the interaction in the second element, and so on. As such, variables should be listed in order of priority for balance.
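As a rough illustration of the nested structure, the following base R sketch builds a two-level school.fb list, checks that each element contains all names in the previous one, and tabulates the interaction categories at each level by treatment status. The schools dataframe and its column names (treated, urban, magnet) are hypothetical, and the tables only show which categories would be balanced, not how the package balances them.

```r
# Hypothetical school-level data: one row per school.
schools <- data.frame(
  treated = c(1, 1, 1, 0, 0, 0, 0),
  urban   = c(1, 1, 0, 1, 0, 0, 1),
  magnet  = c(0, 1, 0, 0, 0, 1, 1)
)

# Highest-priority variables first; each element repeats all earlier names.
school.fb <- list("urban", c("urban", "magnet"))

# Check the nesting requirement between consecutive list elements.
is_nested <- all(mapply(function(prev, cur) all(prev %in% cur),
                        head(school.fb, -1), tail(school.fb, -1)))

# The categories at each level are the cells of the interaction
# of that level's variables.
level1 <- interaction(schools[school.fb[[1]]], drop = TRUE)
level2 <- interaction(schools[school.fb[[2]]], drop = TRUE)

# Counts of treated (1) and control (0) schools in each category.
table(level1, treated = schools$treated)
table(level2, treated = schools$treated)
```

The second level splits every first-level category further, which is why lower-priority variables can only refine, not override, the balance achieved on earlier ones.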


a dataframe with two columns, one containing treated school IDs and the other containing matched control school IDs.
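One way to use a result of this shape is to attach a pair ID to the students in matched schools. The sketch below is base R only and builds a stand-in for the returned dataframe by hand; the column names treated.school and control.school, the school IDs, and the scores are assumptions for illustration, not the names the function actually returns.

```r
# Stand-in for the returned dataframe (column names are assumed).
school.match <- data.frame(treated.school = c("T1", "T2"),
                           control.school = c("C1", "C2"),
                           stringsAsFactors = FALSE)

# Hypothetical student data: two students in each of six schools.
students <- data.frame(
  school = rep(c("T1", "T2", "T3", "C1", "C2", "C3"), each = 2),
  score  = c(52, 48, 61, 57, 45, 50, 49, 53, 60, 58, 41, 47)
)

# Number the matched pairs and stack treated and control schools in long form.
pairs.long <- data.frame(
  school = c(school.match$treated.school, school.match$control.school),
  pair   = rep(seq_len(nrow(school.match)), times = 2)
)

# Inner join: keeps only students whose school appears in a matched pair.
matched.students <- merge(students, pairs.long, by = "school")
```

Students in unmatched schools (T3 and C3 here) drop out of matched.students, and the pair column can then be used for paired or clustered analyses.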


Luke Keele, Penn State University

Sam Pimentel, University of Pennsylvania
