imbalance: Calculates several imbalance measures


imbalance  R Documentation

Calculates several imbalance measures


Calculates several imbalance measures for the original and matched data sets


imbalance(group, data, drop=NULL, breaks = NULL, weights, grouping = NULL)



group     the group variable

data      the data

drop      a vector of variable names in the data frame to ignore

breaks    a list of vectors of cutpoints used to calculate the L1 measure. See Details.

weights

grouping  named list, each element of which is a list of groupings for a single categorical variable. See Details.


This function calculates several imbalance measures. For numeric variables, the difference in means (reported in the column statistic), the difference in quantiles, and the L1 measure are calculated. For categorical variables, the L1 measure and the Chi-squared distance (reported in the column statistic) are calculated. The column type reports either (diff) or (Chi2) to indicate which statistic was calculated.
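As a rough illustration of the numeric-variable statistics described above, the following base R sketch computes a difference in means and differences in quantiles on toy data. The vectors group and age are hypothetical, and both the direction of the difference (group 1 minus group 0) and the choice of quantile probabilities are assumptions, not a statement of what imbalance does internally.

```r
# Toy data (hypothetical): a binary group indicator and one numeric covariate.
group <- c(1, 1, 1, 1, 0, 0, 0, 0)
age   <- c(30, 35, 40, 45, 20, 25, 30, 35)

# Difference in means, taken here as group 1 minus group 0 (assumed direction).
diff_means <- mean(age[group == 1]) - mean(age[group == 0])

# Differences in quantiles, evaluated at the same probabilities in each group.
probs <- c(0, 0.25, 0.5, 0.75, 1)
diff_quant <- quantile(age[group == 1], probs) - quantile(age[group == 0], probs)

print(diff_means)
print(diff_quant)
```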

If breaks is not specified, Scott's rule is used to choose the bins automatically (it coarsens less than Sturges' rule, which is used in cem). Please refer to the cem help page. In either case, the resulting breaks are used to calculate the L1 measure.
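To get a feel for the two binning rules mentioned above, the sketch below compares the bin counts suggested by nclass.scott and nclass.Sturges from base R (grDevices) on simulated data. The variable x is hypothetical, and turning the bin count into equally spaced cutpoints over the range of x is an assumption made for illustration; it is not necessarily how imbalance or cem build their breaks.

```r
# Simulated covariate (hypothetical data).
set.seed(42)
x <- rnorm(500)

# Number of bins suggested by each rule (both functions are in grDevices).
n_scott   <- nclass.scott(x)
n_sturges <- nclass.Sturges(x)

# Assumed construction: equally spaced cutpoints spanning the range of x.
breaks_scott <- seq(min(x), max(x), length.out = n_scott + 1)

print(c(scott = n_scott, sturges = n_sturges))
```

A list such as list(x = breaks_scott) has the shape described for the breaks argument: one vector of cutpoints per variable.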

This function also calculates the global L1 imbalance measure. If breaks is missing, cutpoints are chosen by Scott's rule by default.
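The idea behind the L1 measure can be sketched for a single variable: coarsen the variable into cells, compute the relative frequency of each group in every cell, and take half the sum of the absolute differences. The helper l1_sketch below is hypothetical and is not the package's implementation. It handles one variable only, whereas the global measure is based on the cross-tabulation of all coarsened covariates, and it assumes the groups are coded 1 and 0.

```r
# Hypothetical helper: univariate L1 distance between two groups.
l1_sketch <- function(x, group, breaks) {
  # Coarsen x into cells; both groups share the same factor levels.
  cells <- cut(x, breaks = breaks, include.lowest = TRUE)
  # Relative frequencies per cell in each group.
  f <- prop.table(table(cells[group == 1]))
  g <- prop.table(table(cells[group == 0]))
  # Half the sum of absolute differences, so the result lies in [0, 1].
  0.5 * sum(abs(f - g))
}

x     <- c(1, 2, 5, 6, 1, 5, 6, 7)
group <- c(1, 1, 1, 1, 0, 0, 0, 0)
l1    <- l1_sketch(x, group, breaks = c(0, 4, 8))
print(l1)
```

A value of 0 means the two coarsened distributions coincide, and a value of 1 means they do not overlap at all.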

The grouping option is a list where each element is itself a list. For example, suppose that variable quest1 has the possible levels "no answer", NA, "negative", "neutral", "positive", and you want to collect ("no answer", NA, "neutral") into a single group. The grouping argument should then contain list(quest1=list(c("no answer", NA, "neutral"))). Similarly, if you have a discrete variable elements with values 1:10 and you want to collect it into the groups “1:3,NA”, “4”, “5:9”, “10”, specify in grouping the list list(elements=list(c(1:3,NA), 5:9)). Values not mentioned in the grouping are left as they are. If cutpoints and groupings are defined for the same variable, the groupings take precedence and the corresponding cutpoints are set to NULL.
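The two grouping lists from the paragraph above can be written out and inspected in plain R. The helper apply_grouping is hypothetical (it is not part of cem) and only illustrates the intended effect: values listed together collapse into one level, and unlisted values stay as they are. The labels it produces are an arbitrary choice for this sketch.

```r
# Grouping lists exactly as described in the text.
grouping <- list(
  quest1   = list(c("no answer", NA, "neutral")),
  elements = list(c(1:3, NA), 5:9)
)

# Hypothetical helper (not part of cem): collapse grouped values into one label.
apply_grouping <- function(x, groups) {
  out <- as.character(x)
  for (g in groups) {
    # %in% matches NA against NA, so missing values can join a group.
    out[x %in% g] <- paste(g, collapse = ",")
  }
  out
}

quest1   <- c("no answer", NA, "negative", "neutral", "positive")
elements <- 1:10

q_grouped <- apply_grouping(quest1, grouping$quest1)
e_grouped <- apply_grouping(elements, grouping$elements)
print(unique(q_grouped))
print(unique(e_grouped))
```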

See L1.meas help page for details.


An object of class imbalance, which is a list with the following two elements:


Table of imbalance measures


The global L1 measure of imbalance


Stefano Iacus, Gary King, and Giuseppe Porro


Iacus, King, Porro (2011) doi: 10.1198/jasa.2011.tm09599

Iacus, King, Porro (2012) doi: 10.1093/pan/mpr013

Iacus, King, Porro (2019) doi: 10.1017/pan.2018.29




library(cem)
data(LL)

# ignore the treatment indicator and the outcome when measuring imbalance
todrop <- c("treated","re78")
imbalance(LL$treated, LL, drop=todrop)

# cem match: automatic bin choice
mat <- cem(treatment="treated", data=LL, drop="re78")

cem documentation built on Sept. 8, 2022, 5:09 p.m.