healthequal: Calculating summary measures of inequality

knitr::opts_chunk$set(echo = TRUE,
                      results = TRUE,
                      warning = FALSE,
                      message = FALSE,
                      comment = "",
                      collapse = FALSE,
                      class.source = "bg-success")


Measuring and monitoring inequalities in health is important for informing policies and programs that aim to tackle health inequities. Broadly defined, inequalities in health are measurable differences in health across population subgroups defined by dimensions of inequality (demographic, socioeconomic or geographic characteristics). Summary measures of inequality summarise the amount of inequality across subgroups in a single number -- which facilitates the comparison of inequalities over time and across different settings and indicators.

Summary measures of health inequality use either disaggregated data or individual-level data as inputs.

About summary measures of health inequality

Simple summary measures (difference and ratio) compare two population subgroups. They can be calculated for any dimension of inequality with two or more subgroups. Complex measures take into account the situation in all subgroups and can only be calculated for dimensions with more than two subgroups.
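To make the distinction concrete, the two simple measures can be sketched in base R. The data below are hypothetical values invented for illustration (an indicator by wealth quintile); they are not taken from the package's sample datasets.

```r
# Hypothetical disaggregated data: indicator coverage (%) by wealth quintile,
# ordered from the poorest (Q1) to the richest (Q5).
est <- c(Q1 = 40, Q2 = 55, Q3 = 60, Q4 = 70, Q5 = 80)

# Simple measures compare two subgroups only (here richest vs poorest).
difference <- est[["Q5"]] - est[["Q1"]]  # absolute inequality
ratio      <- est[["Q5"]] / est[["Q1"]]  # relative inequality

# The middle quintiles (Q2 to Q4) play no role in either value.
c(difference = difference, ratio = ratio)
```

Because only two subgroups enter the calculation, any change in the middle quintiles leaves both values unchanged, which is the gap that complex measures are meant to fill.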

Selecting appropriate measures for analysing and reporting inequality involves considering several methodological issues. There are considerations relating to the characteristics of the underlying data, which determine the types of measures that can be calculated and how they are calculated.

There are also considerations relating to the properties of the different measures and the desired purpose of the analysis.

The paper Summary measures of health inequality: a review of existing measures and their application provides further information about these considerations.

Summary measures of health inequality

The following summary measures of health inequality are included in the healthequal library:

Loading the healthequal library

First, load the healthequal package in your R session with library(healthequal). The examples below also use the dplyr package, which can be loaded with library(dplyr).


Read and query the data included in the healthequal package

The healthequal package comes with sample data so that users can test the package functions. The OrderedSample and NonorderedSample datasets contain data disaggregated by economic status and subnational region, respectively, for a single indicator.


The OrderedSampleMultipleind and NonorderedSampleMultipleind datasets contain data disaggregated by economic status and subnational region, respectively, for two indicators.


For information about a dataset, type ? followed by the dataset name (for example, ?OrderedSample or ?NonorderedSample) to display the corresponding help file.



Calculate the absolute concentration index (ACI)

The Absolute Concentration Index (ACI) is a summary measure of health inequality that can be used with ordered dimensions. Type ?aci to view the corresponding help file.


The OrderedSample dataset can be used to calculate ACI. Two arguments are required: est (the subgroup estimate, recorded as estimate in the same dataset), and subgroup_order (the order of subgroups in an increasing sequence). Other arguments, such as pop (the number of people within each subgroup, recorded as population in the sample dataset) or weight (the sampling weight for survey data), are optional. Lastly, the force argument can be used to estimate ACI when some estimates are missing.

with(OrderedSample,
     aci(est = estimate,
         subgroup_order = subgroup_order,
         pop = population))
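For intuition about what an absolute concentration index measures, the following base R sketch applies a commonly used grouped-data formula: each subgroup estimate is weighted by its population share and by its relative rank in the ordered population. The helper name aci_sketch and the data are hypothetical, and the formula is an assumption rather than a statement of how aci() is implemented (it ignores sampling weights, standard errors and the force argument); see ?aci for the package's actual method.

```r
# Illustrative sketch only, not the healthequal implementation.
# `est` and `pop` must already be ordered from the most-disadvantaged
# to the most-advantaged subgroup.
aci_sketch <- function(est, pop) {
  p <- pop / sum(pop)               # population share of each subgroup
  rank_mid <- cumsum(p) - 0.5 * p   # relative rank (midpoint of cumulative share)
  sum(p * (2 * rank_mid - 1) * est) # weighted sum of rank-scaled estimates
}

# Hypothetical data: five equally sized wealth quintiles.
est <- c(40, 55, 60, 70, 80)
pop <- rep(100, 5)

aci_value <- aci_sketch(est, pop)
aci_value
```

Under this formula, a positive value indicates that the indicator is concentrated among the more advantaged subgroups, and a value of zero indicates no inequality along the ordered dimension.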

Calculate the slope index of inequality (SII)

The Slope Index of Inequality (SII) is a summary measure of health inequality that can be used with ordered dimensions. It is an absolute measure that represents the difference in estimated indicator values between the most-advantaged and the most-disadvantaged, while taking into consideration the situation in all other subgroups or individuals by means of an appropriate regression model.

Type ?sii to view the corresponding help file.


SII can be calculated using disaggregated or individual data. In this example, the IndividualSample dataset, a survey weighted dataset, is used. For this type of data, five arguments are required: est (the individual estimate, recorded as sba in the same dataset), subgroup_order (the order of subgroups in an increasing sequence), weight (the sampling weight), psu (the primary sampling unit) and strata (the variable identifying the strata).

with(IndividualSample,
     sii(est = sba,
         subgroup_order = subgroup_order,
         weight = weight,
         psu = psu,
         strata = strata))
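The regression idea behind the SII can be illustrated on disaggregated data with a small base R sketch. It deliberately uses a plain population-weighted linear regression on the relative rank of each subgroup, which is a simplification chosen for illustration; the model fitted by sii() may differ (for example, for bounded indicators or survey designs), so treat the data, variable names and model choice here as assumptions.

```r
# Illustrative sketch only: a weighted linear regression on relative rank.
# Hypothetical data, ordered from most-disadvantaged to most-advantaged.
est <- c(40, 55, 60, 70, 80)        # subgroup estimates
pop <- rep(100, 5)                  # subgroup population sizes

p <- pop / sum(pop)                 # population shares
rank_mid <- cumsum(p) - 0.5 * p     # relative rank of each subgroup (0 to 1)

# Regress the estimates on relative rank, weighting by population size.
fit <- lm(est ~ rank_mid, weights = pop)

# Predicted values at the bottom (rank 0) and top (rank 1) of the distribution.
pred <- predict(fit, newdata = data.frame(rank_mid = c(0, 1)))

# The slope index is the difference between the two predictions.
sii_value <- pred[[2]] - pred[[1]]
sii_value
```

In a linear model this difference equals the fitted slope, which is why every subgroup, not just the two extremes, influences the result.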

Calculate between-group variance (BGV)

Between-group variance (BGV) is a summary measure of health inequality that can be used to measure inequality across non-ordered dimensions of inequality. It is calculated as the weighted average of squared differences between subgroup estimates and the weighted mean. Type ?bgv to view the corresponding help file.

The NonorderedSample dataset can be used to calculate BGV, which requires two arguments: pop (the number of people within each subgroup, recorded as population in the sample dataset), and est (the subgroup estimate, recorded as estimate in the same dataset). The argument se (the standard error of the subgroup estimate) is required only to compute the corresponding 95% confidence intervals.

with(NonorderedSample,
     bgv(pop = population,
         est = estimate,
         se = se))
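The definition given above (a population-weighted average of squared differences from the weighted mean) can be written out directly in base R. The helper name bgv_sketch and the data are hypothetical; the sketch returns only the point estimate and does not reproduce the confidence intervals that bgv() can compute from se.

```r
# Illustrative sketch of the BGV definition, not the healthequal implementation.
bgv_sketch <- function(est, pop) {
  w  <- pop / sum(pop)        # population weights (shares)
  mu <- sum(w * est)          # population-weighted mean of the estimates
  sum(w * (est - mu)^2)       # weighted average of squared differences
}

# Hypothetical data: three subnational regions.
est <- c(50, 60, 70)
pop <- c(100, 200, 100)

bgv_value <- bgv_sketch(est, pop)
bgv_value
```

Because the differences are squared, the result is expressed in squared units of the indicator, and larger subgroups contribute more to the total.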

Calculate multiple measures of inequality for a dataset with multiple indicators

The previous examples showed the calculation of a single measure of inequality for a single indicator-dimension combination. The next examples use the NonorderedSampleMultipleind dataset, which contains data disaggregated by subnational region for two indicators.

The data can be inspected with standard R functions such as head(NonorderedSampleMultipleind).


The Coefficient of Variation (COV) is a summary measure of health inequality that can be used to measure inequality across non-ordered dimensions of inequality. COV is a relative measure of inequality that considers all population subgroups. Type ?covar to view the corresponding help file. The NonorderedSampleMultipleind dataset can be used to calculate COV for each of the two indicators.

measures <- NonorderedSampleMultipleind %>%
  dplyr::group_by(indicator) %>%
  dplyr::summarize(covar(pop = population,
                         est = estimate,
                         scaleval = indicator_scale))
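As a rough illustration of what a coefficient of variation expresses, the base R sketch below divides the between-group standard deviation by the population-weighted mean and multiplies by 100. This formula, the helper name cov_sketch and the data are assumptions made for illustration; the sketch ignores the scaleval argument and any other details of covar(), so consult ?covar for the package's definition.

```r
# Illustrative sketch only, not the healthequal implementation.
cov_sketch <- function(est, pop) {
  w    <- pop / sum(pop)              # population weights (shares)
  mu   <- sum(w * est)                # population-weighted mean
  bgsd <- sqrt(sum(w * (est - mu)^2)) # between-group standard deviation
  100 * bgsd / mu                     # expressed as a percentage of the mean
}

# Hypothetical data: three subnational regions.
est <- c(50, 60, 70)
pop <- c(100, 200, 100)

cov_value <- cov_sketch(est, pop)
cov_value
```

Dividing by the mean makes the measure unit-free, so it can be compared across indicators measured on different scales.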


The NonorderedSampleMultipleind dataset can also be used to calculate two or more summary measures (COV and BGV in the example below) for each indicator-dimension combination.

multiplemeasures <- NonorderedSampleMultipleind %>%
  dplyr::group_by(indicator,
                  dimension) %>%
  dplyr::summarize(
    covar = covar(pop = population,
                  est = estimate,
                  scaleval = indicator_scale),
    bgv = bgv(pop = population,
              est = estimate,
              se = se))


# It is possible to extract the measures separately

Further References

