ILHTEdif: Detect DIF via the IL-HTE mixed model
In difR: Collection of Methods to Detect Dichotomous, Polytomous, and Continuous Differential Item Functioning (DIF)

Description

Implements the item-level heterogeneous treatment effects (IL-HTE) mixed model for detecting differential item functioning (DIF), with optional total- or rest-score matching and iterative purification. The model is

\operatorname{logit}\{P(Y_{ij}=1)\} = \theta_j + b_i + \zeta_i T_j,

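with a person-level term

\theta_j = \beta_0 + \beta_1 T_j + \varepsilon_j,

and item-specific random effects (b_i, \zeta_i) that are jointly normal. Here i indexes items, j indexes subjects, and T_j is the group-membership indicator of subject j (0 = reference, 1 = focal), so \zeta_i is the item-specific group effect used to flag DIF.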

Usage

ILHTEdif(resp_mat, group, subject_ids = NULL, alpha = 0.05,
         nAGQ = 1, purify = FALSE,
         match = c("none", "total", "restscore"),
         maxIter = 2)

Arguments

`resp_mat`	A numeric `matrix` or `data.frame` of binary responses (0/1), rows = subjects, columns = items.
`group`	A vector of length `nrow(resp_mat)` indicating group membership (factor with two levels or numeric 0/1; the second level is treated as the focal group).
`subject_ids`	Optional vector of subject IDs (length `nrow(resp_mat)`); defaults to `1:nrow(resp_mat)`.
`alpha`	Numeric in `(0,1)`. Two-sided significance level used to form the decision threshold `\pm z_{1-\alpha/2}\,\mathrm{SD}(\zeta)`.
`nAGQ`	Integer. Number of adaptive Gauss–Hermite quadrature points passed to `glmer`. `1` (the Laplace approximation) is typically accurate; `0` is faster but less exact.
`purify`	Logical. If `TRUE`, perform iterative purification up to `maxIter` by recomputing the matching score after removing flagged items.
`match`	Character. Matching method: `"none"` (no matching covariate), `"total"` (total score over all items), or `"restscore"` (total excluding currently flagged items in later purification iterations).
`maxIter`	Integer. Maximum number of purification iterations (default `2`).
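
The two matching scores described for `match` can be sketched in base R as follows. This is a toy illustration only, not the package's internal code; `resp`, `flagged`, `total_score`, and `rest_score` are hypothetical names.

```r
# Toy 0/1 response matrix: 5 subjects (rows) x 4 items (columns)
resp <- matrix(c(1, 0, 1, 1,
                 0, 0, 1, 0,
                 1, 1, 1, 1,
                 0, 1, 0, 0,
                 1, 1, 0, 1),
               nrow = 5, byrow = TRUE,
               dimnames = list(NULL, paste0("Item", 1:4)))

# Hypothetical flags from an earlier pass: Item3 was flagged as DIF
flagged <- c(Item1 = FALSE, Item2 = FALSE, Item3 = TRUE, Item4 = FALSE)

# match = "total": sum over all items
total_score <- rowSums(resp)

# match = "restscore": sum over the items that are not currently flagged
rest_score <- rowSums(resp[, !flagged, drop = FALSE])
```

Before any item has been flagged, the two scores coincide; they differ only once at least one item is excluded.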

Details

Let Y_{ij}\in\{0,1\} be the response of subject j to item i. The IL-HTE model is fitted as a generalized linear mixed model (GLMM):

\operatorname{logit}\{P(Y_{ij}=1)\} = \theta_j + b_i + \zeta_i T_j + \gamma S_j,

where b_i is an item intercept, \zeta_i an item-specific group slope (random effect), T_j indicates whether subject j belongs to the focal (1) or reference (0) group, and S_j is an optional matching score (total or rest-score). The person-level term \theta_j is modeled as

\theta_j = \beta_0 + \beta_1 T_j + \varepsilon_j

with \varepsilon_j random across subjects. The item random effects (b_i, \zeta_i) are assumed jointly normal with unstructured covariance.
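
For orientation, a model of this form can be written for `lme4::glmer` roughly as below, with the group slope varying across items (the Value section reports one \zeta estimate per item). This is a sketch on simulated data without a matching score; the names `long`, `id`, `item`, `grp`, `y` and the formula itself are assumptions, not taken from the ILHTEdif source, which may reshape the data and specify the model differently. The fit is skipped when lme4 is not installed.

```r
set.seed(1)
n_subj <- 200
n_item <- 10

grp_subj <- rep(0:1, each = n_subj / 2)   # group indicator: 0 = reference, 1 = focal
theta    <- rnorm(n_subj)                 # person-level term (no mean group shift here)
b        <- rnorm(n_item)                 # item intercepts
zeta     <- rnorm(n_item, sd = 0.4)       # item-specific group effects

# Long format: one row per subject-item pair
long     <- expand.grid(id = seq_len(n_subj), item = seq_len(n_item))
long$grp <- grp_subj[long$id]
eta      <- theta[long$id] + b[long$item] + zeta[long$item] * long$grp
long$y   <- rbinom(nrow(long), size = 1, prob = plogis(eta))
long$id   <- factor(long$id)
long$item <- factor(long$item)

# Assumed formula: fixed group effect, random subject intercept, and
# correlated random item intercept and item-level group slope
f <- y ~ grp + (1 | id) + (1 + grp | item)

if (requireNamespace("lme4", quietly = TRUE)) {
  fit      <- lme4::glmer(f, data = long, family = binomial, nAGQ = 1)
  zeta_hat <- lme4::ranef(fit)$item[, "grp"]   # one estimate per item
}
```

A matching score would enter as an additional fixed-effect covariate in `f`.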

Iterative purification, when enabled, proceeds by (i) fitting the GLMM, (ii) flagging each item i with |\hat{\zeta}_i| > \mathrm{crit}, where crit = qnorm(1 - alpha/2) * SD(zeta) and SD(zeta) is the estimated standard deviation of the random group slope, (iii) recomputing the matching score excluding flagged items (match = "restscore") or including all items (match = "total"), and (iv) refitting until convergence or until maxIter iterations are reached.
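
The flagging rule in step (ii) can be illustrated with made-up numbers. The slope estimates and the standard deviation below are hypothetical values chosen for the example, not output from a fitted model.

```r
alpha    <- 0.05
sd_zeta  <- 0.40   # hypothetical SD of the random group slope
zeta_hat <- c(Item1 = 0.12, Item2 = -0.95, Item3 = 0.30,
              Item4 = 0.81, Item5 = -0.05)   # hypothetical slope estimates

# Decision threshold: z_{1 - alpha/2} * SD(zeta)
crit <- qnorm(1 - alpha / 2) * sd_zeta

# Flag items whose estimated slope exceeds the threshold in absolute value
flagged <- abs(zeta_hat) > crit
names(zeta_hat)[flagged]
```

With these values the threshold is about 0.784, so Item2 and Item4 are flagged.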

Note: fitting the GLMM can take a long time.

Value

A list with components:

model: Fitted glmer object (final iteration).
itemDIF: data.frame with item IDs and random-slope estimates \hat{\zeta}_i, one per item.
itemSig: Subset of itemDIF where |\hat{\zeta}_i| > \mathrm{crit}.
crit: Numeric. Decision threshold z_{1-\alpha/2}\times \mathrm{SD}(\zeta).
plot: A ggplot object showing \hat{\zeta}_i with the \pm threshold.

Note

The function expects binary responses. Any entry outside {0,1} triggers an error.
With purify = TRUE and match = "restscore", flagged items are excluded from the matching score in subsequent iterations.
Setting nAGQ = 0 can substantially reduce run time at a small accuracy cost.

Author(s)

Sebastien Beland
Faculte des sciences de l'education
Universite de Montreal (Canada)
sebastien.beland@umontreal.ca
Josh Gilbert
Harvard Graduate School of Education
Harvard University (USA)
josh.b.gilbert@gmail.com

References

Gilbert, J. B. (2024). Modeling item-level heterogeneous treatment effects: A tutorial with the glmer function from the lme4 package in R. Behavior Research Methods, 56, 5055–5067. doi:10.3758/s13428-023-02245-8

Examples


## Not run: 
# With real data

data(verbal)
Data  <- verbal[, 1:24]        # the 24 binary items
group <- verbal[, "Gender"]    # grouping variable (0/1), not an item column

res1 <- ILHTEdif(
  resp_mat    = Data,
  group       = group,
  alpha       = 0.05
)

# With simulated data, forcing NF = NR

set.seed(2025)
NR <- 300
sim <- SimDichoDif(
  It     = 20,
  ItDIFa = c(2, 5),
  ItDIFb = c(8, 12),
  NR     = NR,
  NF     = NR,         # Same size for NF and NR
  a      = rep(1, 20),
  b      = rnorm(20, 0, 1),
  Ga     = c(0.5, -0.5),
  Gb     = c(1, -1)
)

# Extract response matrix and group vector
resp_mat    <- sim$data[, 1:20]
group       <- factor(sim$data[, 21], labels = c("Ref", "Focal"))
subject_ids <- seq_len(nrow(resp_mat))

# Run the DIF analysis
res2 <- ILHTEdif(
  resp_mat    = resp_mat,
  group       = group,
  subject_ids = subject_ids,
  alpha       = 0.05
)

# With rest score (no purification)
res3 <- ILHTEdif(
  resp_mat    = resp_mat,
  group       = group,
  subject_ids = subject_ids,
  alpha       = 0.05,
  nAGQ        = 1,
  purify      = FALSE,          # purification off
  match       = "restscore",
  maxIter     = 3               # only relevant when purify = TRUE
)

# With purification
res4 <- ILHTEdif(
  resp_mat    = resp_mat,
  group       = group,
  subject_ids = subject_ids,
  alpha       = 0.05,
  nAGQ        = 1,
  purify      = TRUE,           # activate purification
  match       = "total",
  maxIter     = 3               # up to 3 purification passes
)


# View results for res2
print(res2$itemDIF)   # all Zeta estimates
print(res2$itemSig)   # those beyond ±1.96·SD
print(res2$plot)      # plot of Zeta ±1.96·SD

## End(Not run)

difR documentation built on Nov. 29, 2025, 9:06 a.m.