
DIFreport

DIFreport is currently under construction. All information here is provisional and subject to change.

Summary

Development of DIFreport was motivated by questions regarding the psychometric properties of early childhood development assessments used as outcome measures in impact evaluations. This package provides methods for (a) evaluating differential item functioning (DIF), (b) checking whether DIF biases standardized mean difference estimates based on the unit-weighted total score or on item response theory (IRT) scores, and (c) providing DIF-corrected estimates by either removing biased items or using IRT models that adjust for DIF.

Methods

The methods for evaluating DIF currently include semi-parametric regression (LOESS), the Mantel-Haenszel test, logistic regression, and multigroup IRT.
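
To make the logic of one of these methods concrete, the logistic regression approach asks whether group membership predicts an item response after conditioning on a matching score (uniform DIF), and whether a group-by-score interaction improves the fit further (non-uniform DIF). The following sketch is a minimal base-R illustration on simulated data; it is independent of DIFreport, and all object names are hypothetical.

# Simulate one dichotomous item that is harder for the focal group (uniform DIF)
set.seed(1)
n <- 500
group <- rbinom(n, 1, 0.5)                          # 0 = reference, 1 = focal
ability <- rnorm(n)
item <- rbinom(n, 1, plogis(ability - 0.5 * group)) # group lowers the success probability
match.score <- ability                              # stand-in for an observed matching score

# Nested models: matching score only, + group (uniform DIF), + interaction (non-uniform DIF)
m0 <- glm(item ~ match.score, family = binomial)
m1 <- glm(item ~ match.score + group, family = binomial)
m2 <- glm(item ~ match.score * group, family = binomial)
anova(m0, m1, m2, test = "LRT")

A significant likelihood-ratio test comparing m1 to m0 points to uniform DIF; comparing m2 to m1, to non-uniform DIF.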

When evaluating the impact of DIF, standardized mean differences can be estimated with respect to the primary grouping variable (e.g., gender, age, or socio-economic status) as well as with respect to a conditioning variable (e.g., intervention condition). The DIFreport documentation refers to these as unconditional and conditional standardized mean differences, respectively.
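
For concreteness, the following sketch (plain R, independent of DIFreport; the data frame and its columns are hypothetical) mirrors the two examples in the basic workflow below: an unconditional standardized mean difference compares two groups over the whole sample, while conditional standardized mean differences make the same comparison within each level of a second variable and also report the difference between those two estimates.

# Toy data: a score, a treatment indicator, and a gender variable
set.seed(2)
toy <- data.frame(score   = rnorm(400),
                  treated = rep(c("Control", "Treatment"), times = 200),
                  gender  = rep(c("Male", "Female"), each = 200))

# Standardized mean difference (Cohen's d with a pooled SD)
smd <- function(y, g) {
  m <- tapply(y, g, mean); v <- tapply(y, g, var); n <- table(g)
  unname((m[2] - m[1]) / sqrt(sum((n - 1) * v) / (sum(n) - 2)))
}

# Unconditional: Treatment vs. Control over the whole sample
smd(toy$score, toy$treated)

# Conditional: Treatment vs. Control within each gender, and the difference between the two
by.gender <- sapply(split(toy, toy$gender), function(d) smd(d$score, d$treated))
by.gender
diff(by.gender)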

Workflows

DIFreport provides two general workflows, both of which are illustrated below. In the “basic” workflow, summary_report() is an omnibus function that automates the DIF analysis and, if any DIF is detected, computes the standardized mean differences with and without adjusting for DIF. The results are summarized in a report produced with R Markdown.

In the “advanced” workflow, the steps automated by summary_report() can be implemented directly, which gives more control over which DIF analyses are conducted and which items are identified as biased.

Both workflows are currently under construction, particularly the advanced workflow, because the output returned by some functions is not very user-friendly. These outputs may be converted to S3 or S4 objects with methods (e.g., print, summary) to provide greater utility.

DIFreport is in the alpha stage of development. Please report any bugs or suggestions by opening a GitHub issue.

Installation

install.packages("remotes")
remotes::install_github("knickodem/DIFreport")
library(DIFreport)

Workflows

Dataset

The examples in both workflows use the built-in dataset mdat, which contains item responses from the MDAT language assessment collected for the evaluation below.

Neuman, M., Ozler, B., & Fernald, L. (2013). Protecting early childhood development in Malawi impact evaluation survey 2013, midline. World Bank. https://doi.org/10.48529/94zr-ww41

Basic

The basic workflow requires only two functions: dif_prep() to prepare the data and summary_report() to run the analyses and generate the report.

Unconditional Standardized Mean Differences

If assessing DIF with respect to only the grouping variable of interest and evaluating the robustness of the unconditional standardized mean differences, only dif.groups needs to be supplied to dif_prep(). If cond.groups is also supplied, then conditional standardized mean differences are estimated (see Conditional Standardized Mean Differences). The function summary_report() recognizes how the data were prepared and estimates the appropriate standardized mean differences.

The code below uses the built-in dataset “mdat” to generate this report.

# Load the example data (MDAT language assessment, collected in Malawi)
data("mdat")

mdat_tx <- dif_prep(item.data = mdat[5:ncol(mdat)],  # item responses are in columns 5 onward
                    dif.groups = mdat$treated,       # DIF is evaluated with respect to treatment condition
                    cluster = mdat$cluster,          # cluster membership from the cluster-randomized design
                    na.to.0 = TRUE)                  # recode missing item responses to 0

summary_report(dif.data = mdat_tx,
               file.name = "DIF-Effects-Tx-MDAT-Language",
               report.type = "dif.effects",
               report.title = "MDAT Language: DIF by Treatment Condition",
               measure.name = "MDAT Language",
               dataset.name = "Malawi")

Conditional Standardized Mean Differences

If conditional standardized mean differences with respect to a conditioning variable (e.g., treatment condition) are of interest, such as the standardized mean difference between treatment conditions for just girls, for just boys, and their difference, then both dif.groups and cond.groups need to be supplied to dif_prep().

The code below generates this report.

mdat_gender <- dif_prep(item.data = mdat[5:ncol(mdat)],
                        dif.groups = mdat$gender,     # DIF is evaluated with respect to gender
                        cond.groups = mdat$treated,   # conditioning variable (treatment condition)
                        cluster = mdat$cluster,
                        ref.name = "Male")            # "Male" is the reference group in mdat$gender

summary_report(dif.data = mdat_gender,
               file.name = "DIF-Effects-Gender-MDAT-Language",
               report.type = "dif.effects",
               report.title = "MDAT Language: DIF by Gender",
               measure.name = "MDAT Language",
               dataset.name = "Malawi")

Advanced

The advanced workflow shows what is going on under the hood of summary_report() and gives more control over which items are identified as biased in the report. In this section, we work through the conditional standardized mean difference example in more detail.

The first step is still dif_prep(), which operates in the same manner as in the basic workflow. This step is repeated below.

library(DIFreport)
data("mdat")

mdat_gender <- dif_prep(item.data = mdat[5:ncol(mdat)],
                        dif.groups = mdat$gender,     # DIF is evaluated with respect to gender
                        cond.groups = mdat$treated,   # conditioning variable (treatment condition)
                        cluster = mdat$cluster,
                        ref.name = "Male")            # "Male" is the reference group in mdat$gender

DIF Analysis

The dif_analysis() function explores DIF with respect to the dif.groups using various methods. The goal is to arrive at an overall picture of the data that does not depend on the assumptions of any one particular method. Currently, four DIF methods are available for dichotomous data, but only semi-parametric regression and IRT can be used with multi-category (“polytomous”) assessment items.

Each method has its own low-level function (dif_loess(), dif_mh(), dif_logistic(), and dif_irt()); these are called by dif_analysis().

dif.analysis <- dif_analysis(dif.data = mdat_gender,
                             dif.methods = c("loess", "MH", "logistic", "IRT"),
                             match.type = "Total",            # condition on the total score
                             match.bins = seq(0, 1, by = .1)) # use deciles of the matching variable in MH

# View the biased items reported by the "MH", "logistic", or "IRT" methods
options(knitr.kable.NA = '')  # print NA cells as blanks
knitr::kable(biased_items_table(dif.analysis))
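
For intuition about what the Mantel-Haenszel method is testing, the base-R sketch below (independent of DIFreport; the data and object names are hypothetical) stratifies examinees into total-score bands and tests whether, within bands, the odds of a correct response differ between the two groups. In dif_analysis(), the stratification is controlled by the match.type and match.bins arguments shown above.

# Hypothetical dichotomous item, grouping variable, and total score
set.seed(3)
n <- 600
group <- factor(rep(c("Male", "Female"), each = n / 2))
total <- rowSums(matrix(rbinom(n * 10, 1, 0.6), nrow = n))  # 10-item total score
item  <- rbinom(n, 1, plogis(scale(total)[, 1]))            # depends only on the total score, so no DIF

# Stratify on the total score and run the Mantel-Haenszel test
score.band <- cut(total, breaks = 4)
mantelhaen.test(table(item, group, score.band))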

If the standardized mean difference estimates are not of interest, a report of the DIF analysis alone can be generated with dif_report(). The report enables comparison of the DIF methods, with one method highlighted via the biased.items argument (example report).

dif_report(dif.analysis = dif.analysis,
           file.name = "DIF-Only-Gender-MDAT-Language",
           report.title = "Gender DIF in MDAT Language",
           measure.name = "MDAT Language",
           dataset.name = "Malawi",
           biased.items = "IRT")

DIF Impact

The last step in this workflow is to estimate standardized mean differences, with and without adjustments for DIF. Standardized mean differences and their standard errors are computed using the method described by Hedges (2007), which is applicable to cluster-randomized designs as well as simple random samples.
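
As a rough illustration only (not DIFreport's implementation), the sketch below computes a naive standardized mean difference and then applies the clustering adjustment Hedges (2007) describes for effects standardized by the total standard deviation. It assumes equal cluster sizes, treats the intraclass correlation as known, and omits the standard-error formulas; see Hedges (2007) for the general estimators.

# Naive SMD plus the Hedges (2007) clustering adjustment (equal cluster sizes assumed)
smd_clustered <- function(mean_t, mean_c, sd_total, n_cluster, N, icc) {
  d_naive <- (mean_t - mean_c) / sd_total
  d_naive * sqrt(1 - (2 * (n_cluster - 1) * icc) / (N - 2))
}

# Hypothetical values: clusters of 25 children, total sample of 1,000, ICC = 0.10
smd_clustered(mean_t = 0.35, mean_c = 0.10, sd_total = 1,
              n_cluster = 25, N = 1000, icc = 0.10)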

Standardized mean differences and their standard errors are reported for four outcomes: the unit-weighted total score and an IRT-based score, each computed with and without an adjustment for DIF (dropping the biased items for the total score, and using a DIF-adjusted model for the IRT score).

The information supplied to dif_prep(), which is passed on to dif_impact(), provides additional details about how the standardized mean differences are computed.

A report covering only the robustness of the standardized mean difference estimates to biased items, with no DIF analysis information, can be generated with effect_report(). More often, the DIF analysis results are also of interest, in which case dif_effect_report() includes summaries of both the DIF analysis and the standardized mean difference robustness checks, such as the report generated here.

dif_effect_report(dif.analysis = dif.analysis,
                  dif.models = dif.models,
                  effect.robustness = effect.robustness,
                  file.name = "Logistic-Gender-MDAT-Language",
                  report.title = "Gender DIF in MDAT Language",
                  measure.name = "MDAT Language",
                  dataset.name = "Malawi",
                  biased.items = "logistic")

References

Hedges, L. V. (2007). Effect sizes in cluster-randomized designs. Journal of Educational and Behavioral Statistics, 32, 341–370. https://doi.org/10.3102/1076998606298043


