lassoDIF.CV: Detection of Differential Item Functioning Using the Lasso...

lassoDIF.CV    R Documentation

Detection of Differential Item Functioning Using the Lasso Approach: Selection of Optimal \lambda via Cross-Validation

Description

Performs DIF detection using a lasso-penalized logistic regression model for dichotomous items and selects the optimal penalty parameter \lambda via cross-validation.

Usage

lassoDIF.CV(Data, group, nfold = 5, lambda = NULL, ...)

Arguments

...

Additional arguments passed to internal methods.

Data

A numeric data frame or matrix: either only the item responses or the item responses with a group membership column.

group

A numeric or character vector: either a vector of group membership or a column index/name indicating group membership in Data.

nfold

Integer: the number of folds used in cross-validation. Default is 5.

lambda

Optional numeric vector of \lambda values to be used in the penalization path. If NULL, a default sequence is used.

Details

This function detects uniform differential item functioning (DIF) using a lasso-penalized logistic regression model and selects the penalty parameter \lambda^* that minimizes cross-validation error. For this selected value, the function returns the estimated DIF parameters for all items and flags those with non-zero DIF effects.

Note: The performance of the method depends on choices such as the number of folds and the grid of \lambda values. We strongly recommend testing different configurations to assess the robustness of the results before interpretation.
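
The robustness check recommended in the note can be sketched as follows. This is a minimal illustration, not part of difR: compare_folds is a hypothetical helper, the fold counts 3, 5 and 10 are arbitrary choices, and the code assumes that DIFitems is returned as a vector of item indices (if a run flags no item, the package may return something else, in which case the intersection should be handled separately).

```r
library(difR)  # assumes a difR version that provides lassoDIF.CV

data(verbal)
Dat    <- verbal[, 1:20]   # first 20 items of the verbal data set
Member <- verbal[, 26]     # group membership column

# Hypothetical helper (not part of difR): rerun the cross-validated
# lasso for several fold counts and collect the selected lambda and
# the flagged items from each run.
compare_folds <- function(Data, group, folds = c(3, 5, 10), seed = 1234) {
  lapply(folds, function(k) {
    set.seed(seed)  # same seed for every configuration
    res <- lassoDIF.CV(Data, group, nfold = k)
    list(nfold = k, opt.lambda = res$opt.lambda, DIFitems = res$DIFitems)
  })
}

runs <- compare_folds(Dat, Member)

# Items flagged under every configuration (assumes DIFitems holds indices)
stable <- Reduce(intersect, lapply(runs, function(r) r$DIFitems))
```

Items that appear in stable are flagged regardless of the number of folds; items flagged in only some runs deserve more caution. The same pattern could be repeated over several seeds or over user-supplied lambda grids.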

Value

A list with the following components:

DIFitems

Indices of items flagged as exhibiting DIF.

DIFpars

Matrix of estimated DIF parameters for each item.

crit.value

Cross-validation criterion values (deviance) across the \lambda path.

crit.type

The type of criterion used, here "cv".

lambda

Vector of \lambda values considered.

opt.lambda

The optimal \lambda value selected via cross-validation.

glmnet.fit

Fitted glmnet model object.

Author(s)

David Magis
Data science consultant at IQVIA Belux
Brussels, Belgium
Carl F. Falk
Department of Psychology
McGill University (Canada)
carl.falk@mcgill.ca, https://www.mcgill.ca/psychology/carl-f-falk
Sebastien Beland
Faculte des sciences de l'education
Universite de Montreal (Canada)
sebastien.beland@umontreal.ca

References

Magis, D., Tuerlinckx, F., & De Boeck, P. (2015). Detection of Differential Item Functioning Using the Lasso Approach. Journal of Educational and Behavioral Statistics, 40(2), 111–135. https://doi.org/10.3102/1076998614559747

Examples

## Not run: 

# With the Verbal data set

data(verbal)

Dat    <- verbal[, 1:20]
Member <- verbal[, 26]

# Using cross-validation
set.seed(1234) 

cv.res <- lassoDIF.CV(Dat, Member, nfold=5)
cv.res

# With simulated data

It   <- 15 # number of items
ItDIFa <- NULL
ItDIFb <- c(1,3)
NR   <- 100 # number of responses for group 1 (reference)
NF   <- 100 # number of responses for group 2 (focal)
a    <- rep(1,It)          # for tests: runif(It,0.2,.5)  
b    <- rnorm(It,1,.5)  
Gb   <- rep(2,2)           # Group value for U-DIF
Ga   <- 0                  # Group value for NU-DIF: must be fixed at 0 for U-DIF
Out1 <- SimDichoDif(It,ItDIFa,ItDIFb,NR,NF,a,b,Ga,Gb)
Dat    <- Out1$data[, 1:15]
Member <- Out1$data[, 16]

set.seed(1234) # results appear to be sensitive to the random number seed

cv.res <- lassoDIF.CV(Dat, Member, nfold=5)
cv.res

 
## End(Not run)
 

difR documentation built on June 8, 2025, 1:03 p.m.