| difLogistic | R Documentation |
Performs DIF detection using the logistic regression method.
difLogistic(
Data,
group,
focal.name,
anchor = NULL,
member.type = "group",
match = "score",
type = "both",
criterion = "LRT",
alpha = 0.05,
all.cov = FALSE,
purify = FALSE,
nrIter = 10,
p.adjust.method = NULL,
puriadjType = "simple",
save.output = FALSE,
output = c("out", "default")
)
## S3 method for class 'Logistic'
plot(
x,
plot = "lrStat",
item = 1,
itemFit = "best",
pch = 8,
number = TRUE,
col = "red",
colIC = rep("black", 2),
ltyIC = c(1, 2),
save.plot = FALSE,
save.options = c("plot", "default", "pdf"),
group.names = NULL,
...
)
Data |
numeric: either the data matrix only, or the data matrix plus the vector of group membership. See Details. |
group |
numeric or character: either the vector of group membership or
the column indicator (within |
focal.name |
numeric or character indicating the level of |
anchor |
either |
member.type |
character: either |
match |
specifies the type of matching criterion. Can be either
|
type |
a character string specifying which DIF effects must be tested.
Possible values are |
criterion |
a character string specifying which DIF statistic is
computed. Possible values are |
alpha |
numeric: significance level (default is 0.05). |
all.cov |
logical: should all covariance matrices of model
parameter estimates be returned (as lists) for both nested models and all
items? (default is |
purify |
logical: should the method be used iteratively to purify the
set of anchor items? (default is |
nrIter |
numeric: the maximal number of iterations in the item purification process. (default is 10). |
p.adjust.method |
either |
puriadjType |
character: type of combination of the item purification
and the method for p-value adjustment for multiple comparisons. Either
|
save.output |
logical: should the output be saved into a text file?
(default is |
output |
character: a vector of two components. The first component is
the name of the output file, the second component is either the file path
or |
x |
the result from a |
plot |
character: the type of plot, either |
item |
numeric or character: either the number or the name of the item
for which logistic curves are plotted. Used only when |
itemFit |
character: the model to be selected for drawing the item
curves. Possible values are |
pch, col |
type of usual |
number |
logical: should the item number identification be printed
(default is |
colIC, ltyIC |
vectors of two elements of the usual |
save.plot |
logical: should the plot be saved into a separate file?
(default is |
save.options |
character: a vector of three components. The first
component is the name of the output file, the second component is either
the file path or |
group.names |
either |
... |
other generic parameters for the |
The logistic regression method (Swaminathan & Rogers, 1990) detects both
uniform and nonuniform differential item functioning without requiring an
item response model. It consists of fitting a logistic model with the
matching criterion, the group membership and their interaction as
covariates. The statistical significance of the parameters related to group
membership and to the group-score interaction is then evaluated by means of
either the likelihood-ratio test or the Wald test. The argument type
selects which effects are tested: both uniform and nonuniform effects
simultaneously (type = "both"), the uniform DIF effect only
(type = "udif") or the nonuniform DIF effect only (type = "nudif").
The argument criterion selects either the likelihood-ratio test
(criterion = "LRT") or the Wald test (criterion = "Wald").
See Logistik for further details.
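The nested-model comparison described above can be sketched in base R with glm. This is an illustration on simulated data, not the code used inside difLogistic or Logistik: the variable names, the simulated effect sizes and the helper lr_test are assumptions made for the example, as is the choice of nested models for type = "udif" and type = "nudif".

```r
# Simulated data (hypothetical): one binary item, a matching score, two groups
set.seed(1)
n     <- 400
group <- rep(c(0, 1), each = n / 2)        # 0 = reference, 1 = focal
score <- rnorm(n)                          # stand-in for the matching criterion
eta   <- -0.2 + 1.0 * score + 0.8 * group  # uniform DIF built in via the group term
item  <- rbinom(n, 1, plogis(eta))

# Three nested logistic models
m_full  <- glm(item ~ score * group, family = binomial)  # score, group, interaction
m_unif  <- glm(item ~ score + group, family = binomial)  # score and group only
m_score <- glm(item ~ score,         family = binomial)  # score only

# Likelihood-ratio test between two nested fits (illustrative helper)
lr_test <- function(full, reduced) {
  stat <- deviance(reduced) - deviance(full)
  df   <- df.residual(reduced) - df.residual(full)
  c(stat = stat, df = df, p = pchisq(stat, df, lower.tail = FALSE))
}

lr_both  <- lr_test(m_full, m_score)  # uniform and nonuniform effects together
lr_udif  <- lr_test(m_unif, m_score)  # uniform effect only
lr_nudif <- lr_test(m_full, m_unif)   # nonuniform effect only
```

By construction, the statistic for the joint test equals the sum of the two one-degree-of-freedom statistics in this sketch.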
The group membership can be either a vector of two distinct values, one for
the reference group and one for the focal group, or a continuous or
discrete variable that acts as the "group" membership variable. In the
former case, the member.type argument is set to "group" and
the focal.name defines which value in the group variable
stands for the focal group. In the latter case, member.type is set
to "cont", focal.name is ignored and each value of
group represents one "group" of data (that is, the DIF effects are
investigated among participants with different values of some
discrete or continuous trait). See Logistik for further
details.
The matching criterion can be either the test score or any other continuous
or discrete variable, which is passed to the Logistik function.
This is specified by the match argument. By default, it takes the
value "score" and the test score (i.e. raw score) is computed. The
second option is to assign to match a vector of continuous or
discrete numeric values, which acts as the matching criterion. Note that
for consistency this vector should not belong to the Data matrix.
Data is a matrix whose rows correspond to the subjects and whose
columns correspond to the items. In addition, Data can hold the vector of group
membership. If so, group indicates the column of Data which
corresponds to the group membership, either by specifying its name or by
giving the column number. Otherwise, group must be a vector of
length nrow(Data).
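The two ways of supplying the group membership can be pictured with a small base-R helper. The function split_data and the toy data frame below are hypothetical and only mimic the behaviour described above; they are not part of the package.

```r
# Illustrative helper: separate item responses from group membership
split_data <- function(Data, group) {
  if (length(group) == 1) {
    # group is a column name or a column number of Data
    col   <- if (is.character(group)) which(colnames(Data) == group) else group
    g     <- Data[, col]
    items <- Data[, -col, drop = FALSE]
  } else {
    # group is a separate vector, one value per row of Data
    stopifnot(length(group) == nrow(Data))
    g     <- group
    items <- Data
  }
  list(items = items, group = g)
}

# Toy data: two items and a group column
toy <- data.frame(I1 = c(1, 0, 1, 0), I2 = c(0, 0, 1, 1), Gender = c(0, 0, 1, 1))

by_number <- split_data(toy, 3)
by_name   <- split_data(toy, "Gender")
by_vector <- split_data(toy[, 1:2], toy[, 3])
```

All three calls yield the same item matrix and the same group vector, which mirrors the three equivalent settings shown in the Examples section.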
Missing values are allowed for item responses (not for group membership)
but must be coded as NA values. They are discarded from the fitting
of the logistic models (see glm for further details).
The threshold (or cut-score) for classifying items as DIF is computed as
the quantile of the chi-squared distribution with lower-tail probability of
one minus alpha and with one (if type = "udif" or
type = "nudif") or two (if type = "both") degrees of freedom.
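As a quick check of the rule above, the threshold can be computed directly with qchisq from base R. The variable names are illustrative.

```r
alpha <- 0.05

# One degree of freedom: type = "udif" or type = "nudif"
thr_one <- qchisq(1 - alpha, df = 1)
# Two degrees of freedom: type = "both"
thr_two <- qchisq(1 - alpha, df = 2)

round(c(thr_one, thr_two), 3)  # 3.841 5.991
```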
Item purification can be performed by setting purify to TRUE.
Purification works as follows: if at least one item is detected as
functioning differently at the first step of the process, then the data set
of the next step consists of all items that are currently anchor (DIF-free)
items, plus the tested item if it is not already among them. The process stops when either
two successive applications of the method yield the same classifications of
the items (Clauser & Mazor, 1998), or when nrIter iterations are
run without obtaining two successive identical classifications. In the
latter case a warning message is printed. Note that purification is
possible only if the test score is considered as the matching criterion.
Thus, purify is ignored when match is not "score".
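The purification loop described above can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's implementation: the helpers lr_item and purify_sketch, the simulated data and the use of the joint two-degree-of-freedom likelihood-ratio test are all choices made for the example.

```r
# Likelihood-ratio statistic for one item given a matching score (illustrative)
lr_item <- function(y, score, group) {
  full <- glm(y ~ score * group, family = binomial)
  red  <- glm(y ~ score, family = binomial)
  deviance(red) - deviance(full)
}

purify_sketch <- function(items, group, alpha = 0.05, nrIter = 10) {
  thr <- qchisq(1 - alpha, df = 2)
  J   <- ncol(items)
  # Flag every item, using the anchor items plus the tested item as test score
  flag_with <- function(anchor) {
    vapply(seq_len(J), function(j) {
      use   <- union(which(anchor), j)
      score <- rowSums(items[, use, drop = FALSE])
      lr_item(items[, j], score, group) > thr
    }, logical(1))
  }
  dif       <- flag_with(rep(TRUE, J))  # first step: all items form the score
  history   <- list(dif)
  converged <- !any(dif)                # nothing to purify if no item is flagged
  iter      <- 0
  while (!converged && iter < nrIter) {
    iter      <- iter + 1
    new_dif   <- flag_with(!dif)        # currently unflagged items are anchors
    history[[length(history) + 1]] <- new_dif
    converged <- identical(new_dif, dif)
    dif       <- new_dif
  }
  if (!converged) warning("no convergence within nrIter iterations")
  list(dif = dif, iterations = iter, converged = converged,
       history = do.call(rbind, history))
}

# Simulated data (hypothetical): 10 items, item 1 has uniform DIF by construction
set.seed(42)
n <- 600; J <- 10
group <- rep(c(0, 1), each = n / 2)
theta <- rnorm(n)
b     <- seq(-1, 1, length.out = J)
items <- sapply(seq_len(J), function(j) {
  shift <- if (j == 1) 1 else 0
  rbinom(n, 1, plogis(theta - b[j] + shift * group))
})

res <- purify_sketch(items, group)
```

Each row of res$history holds one classification of the items, starting with the initial one, so it has one more row than the number of purification iterations.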
Adjustment for multiple comparisons is possible with the argument
p.adjust.method. The latter must be the name of one of the
adjustment methods available in the p.adjust function.
According to Kim and Oshima (2013), Holm and Benjamini-Hochberg adjustments
(set respectively by "holm" and "BH") perform best for DIF
purposes. See the p.adjust function for further details. Note
that item purification is performed on the original statistics and p-values;
any adjustment for multiple comparisons is applied after item
purification.
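The effect of the two recommended adjustments can be seen on a small vector of p-values with p.adjust from base R, whose method names are listed in p.adjust.methods. The p-values below are made up for illustration.

```r
# Hypothetical unadjusted p-values for five items
p_raw <- c(0.001, 0.012, 0.020, 0.045, 0.200)

p_bh   <- p.adjust(p_raw, method = "BH")    # Benjamini-Hochberg
p_holm <- p.adjust(p_raw, method = "holm")  # Holm

# Items flagged at the 5% level under each adjustment
flag_bh   <- which(p_bh < 0.05)
flag_holm <- which(p_holm < 0.05)
```

In this example the Holm adjustment flags fewer items than the Benjamini-Hochberg adjustment, which is the usual pattern since Holm controls the family-wise error rate.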
A pre-specified set of anchor items can be provided through the
anchor argument. It must be a vector of either item names (which
must match exactly the column names of Data argument) or integer
values (specifying the column numbers for item identification). In case
anchor items are provided, they are used to compute the test score
(matching criterion), including also the tested item. None of the anchor
items are tested for DIF: the output separates anchor items and tested
items and DIF results are returned only for the latter. By default it is
NULL so that no anchor item is specified. Note also that item
purification is not activated when anchor items are provided (even if
purify is set to TRUE). Moreover, if the match
argument is not set to "score", anchor items will not be taken into
account even if anchor is not NULL.
Effect size is measured by the difference \Delta R^2
between the R^2 coefficients of the two nested models (Nagelkerke,
1991; Gomez-Benito, Dolores Hidalgo & Padilla, 2009). The effect sizes
are classified as "negligible", "moderate" or "large". Two scales are
available, one from Zumbo and Thomas (1997) and one from Jodoin and Gierl
(2001). The output displays the \Delta R^2 measures, together with
the two classifications.
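A minimal sketch of the effect size computation, in base R on simulated data. The helpers nagelkerke and classify are illustrative. The cut-offs used (0.13 and 0.26 for Zumbo and Thomas, 0.035 and 0.070 for Jodoin and Gierl) are the values commonly quoted for these scales and are an assumption here, since this page does not state them; the handling of values exactly on a cut-off is also an assumption.

```r
# Nagelkerke's R^2 for a binary-response glm fit (illustrative helper)
nagelkerke <- function(fit) {
  n         <- length(fit$y)
  cox_snell <- 1 - exp((fit$deviance - fit$null.deviance) / n)
  cox_snell / (1 - exp(-fit$null.deviance / n))
}

# Classify an effect size given two cut-offs (assumed values, see above)
classify <- function(delta, cuts) {
  as.character(cut(delta, breaks = c(-Inf, cuts, Inf),
                   labels = c("negligible", "moderate", "large")))
}

# Simulated data (hypothetical) with a uniform DIF effect
set.seed(7)
n     <- 500
group <- rep(c(0, 1), each = n / 2)
score <- rnorm(n)
item  <- rbinom(n, 1, plogis(0.1 + 1.2 * score + 0.7 * group))

m_full <- glm(item ~ score * group, family = binomial)
m_red  <- glm(item ~ score, family = binomial)

delta_r2 <- nagelkerke(m_full) - nagelkerke(m_red)

zt <- classify(delta_r2, c(0.13, 0.26))    # Zumbo and Thomas scale
jg <- classify(delta_r2, c(0.035, 0.070))  # Jodoin and Gierl scale
```

Because the Jodoin and Gierl cut-offs are much lower, the same \Delta R^2 can be "negligible" on one scale and "moderate" or "large" on the other.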
The output of the difLogistic() function, as displayed by the
print.Logistic function, can be stored in a text file provided that
save.output is set to TRUE (the default value FALSE
does not execute the storage). In this case, the name of the text file must
be given as a character string into the first component of the
output argument (default name is "out"), and the path for
saving the text file can be given through the second component of
output. The default value is "default", meaning that the file
will be saved in the current working directory. Any other path can be
specified as a character string: see the Examples section for an
illustration.
Two types of plots are available. The first one is obtained by setting
plot = "lrStat" and it is the default option. The likelihood ratio
statistics are displayed on the Y axis, for each item. The detection
threshold is displayed by a horizontal line, and items flagged as DIF are
printed with the color defined by argument col. By default, items
are labelled with their item number (number = TRUE);
otherwise they are drawn as points whose symbol is given by the option
pch.
The other type of plot is obtained by setting plot = "itemCurve". In
this case, the fitted logistic curves are displayed for one specific item
set by the argument item. The latter argument can hold either the
name of the item or its number identification. If the argument
itemFit takes the value "best", the curves are drawn
according to the output of the best model among M_0 and M_1.
That is, two curves are drawn if the item is flagged as DIF, and only one
if the item is flagged as non-DIF. If itemFit takes the value
"null", then the two curves are drawn from the fitted parameters of
the null model M_0. See Logistik for further details on
the models. The colors and line types of these curves are defined by
means of the arguments colIC and ltyIC respectively. These
are set as vectors of length 2, the first element for the reference group
and the second for the focal group. Finally, the argument
group.names sets the names of the reference and focal
groups displayed in the legend (instead of "Reference" and "Focal").
Both types of plots can be saved to a figure file, in either PDF or JPEG
format, by setting save.plot to TRUE. The
figure file is defined through the components of save.options. The first
two components work in the same way as those of the output argument.
The third component is the figure format, with allowed values "pdf"
(default) for a PDF file and "jpeg" for a JPEG file.
A list of class "Logistic" with the following components:
the values of the logistic regression statistics.
the vector of p-values for the logistic regression statistics.
a matrix with one row per item and four columns, holding the fitted parameters of the best model (among the two tested models) for each item.
a matrix with one row per item and four columns, holding the standard errors of the fitted parameters of the best model (among the two tested models) for each item.
the matrix of fitted parameters of the null model M_0, as returned by the Logistik command.
the matrix of standard error of fitted parameters of the null model M_0, as returned by the Logistik command.
either NULL (if all.cov argument is FALSE) or a list of covariance matrices of parameter estimates of the "full" model (M_0) for each item (if all.cov argument is TRUE).
either NULL (if all.cov argument is FALSE) or a list of covariance matrices of parameter estimates of the "reduced" model (M_1) for each item (if all.cov argument is TRUE).
the differences in Nagelkerke's R^2 coefficients. See Details.
the value of alpha argument.
the threshold (cut-score) for DIF detection.
either the column indicators for the items which were detected as DIF items, or "No DIF item detected".
the value of the member.type argument.
a character string, either "score" or "matching variable" depending on the match argument.
the value of type argument.
the value of the p.adjust.method argument.
either NULL or the vector of adjusted p-values for multiple comparisons.
the value of purify option.
the number of iterations in the item purification process. Returned only if purify is TRUE.
a binary matrix with one row per iteration in the item purification process and one column per item. Zeros and ones in the i-th row refer to items which were classified respectively as non-DIF and DIF items at the (i-1)-th step. The first row corresponds to the initial classification of the items. Returned only if purify is TRUE.
logical indicating whether the iterative item purification process stopped before the maximal number of nrIter allowed iterations. Returned only if purify is TRUE.
the value of puriadjType option. Returned only when purify is TRUE.
the names of the items.
the value of the anchor argument.
the value of the criterion argument.
the value of the save.output argument.
the value of the output argument.
David Magis
Data science consultant at IQVIA Belux
Brussels, Belgium
Sebastien Beland
Faculte des sciences de l'education
Universite de Montreal (Canada)
sebastien.beland@umontreal.ca
Gilles Raiche
Universite du
Quebec a Montreal
raiche.gilles@uqam.ca
Adela Hladka (nee Drabinova)
Institute of Computer Science of the Czech Academy of Sciences
hladka@cs.cas.cz
Clauser, B.E. and Mazor, K.M. (1998). Using statistical procedures to identify differential item functioning test items. Educational Measurement: Issues and Practice, 17, 31–44.
Finch, W.H. and French, B. (2007). Detection of crossing differential item functioning: a comparison of four methods. Educational and Psychological Measurement, 67, 565–582, doi:10.1177/0013164406296975
Gomez-Benito, J., Dolores Hidalgo, M. and Padilla, J.-L. (2009). Efficacy of effect size measures in logistic regression: An application for detecting DIF. Methodology, 5, 18–25, doi:10.1027/1614-2241.5.1.18
Hidalgo, M. D. and Lopez-Pina, J.A. (2004). Differential item functioning detection and effect size: A comparison between logistic regression and Mantel-Haenszel procedures. Educational and Psychological Measurement, 64, 903–915, doi:10.1177/0013164403261769
Hladká, A., Martinková, P., and Magis, D. (2023). Combining item purification and multiple comparison adjustment methods in detection of differential item functioning. Multivariate Behavioral Research, 59(1), 46–61, doi:10.1080/00273171.2023.2205393
Jodoin, M. G. and Gierl, M. J. (2001). Evaluating Type I error and power rates using an effect size measure with logistic regression procedure for DIF detection. Applied Measurement in Education, 14, 329–349, doi:10.1207/S15324818AME1404_2
Kim, J., and Oshima, T. C. (2013). Effect of multiple testing adjustment in differential item functioning detection. Educational and Psychological Measurement, 73, 458–470, doi:10.1177/0013164412467033
Magis, D., Beland, S., Tuerlinckx, F. and De Boeck, P. (2010). A general framework and an R package for the detection of dichotomous differential item functioning. Behavior Research Methods, 42, 847–862, doi:10.3758/BRM.42.3.847
Nagelkerke, N. J. D. (1991). A note on a general definition of the coefficient of determination. Biometrika, 78, 691–692, doi:10.1093/biomet/78.3.691
Swaminathan, H. and Rogers, H. J. (1990). Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement, 27, 361–370, doi:10.1111/j.1745-3984.1990.tb00754.x
Zumbo, B.D. (1999). A handbook on the theory and methods of differential item functioning (DIF): logistic regression modelling as a unitary framework for binary and Likert-type (ordinal) item scores. Ottawa, ON: Directorate of Human Resources Research and Evaluation, Department of National Defense.
Zumbo, B. D. and Thomas, D. R. (1997). A measure of effect size for a model-based approach for studying DIF. Prince George, Canada: University of Northern British Columbia, Edgeworth Laboratory for Quantitative Behavioral Science.
Logistik, dichoDif
## Not run:
# Loading of the verbal data
data(verbal)
# Excluding the "Anger" variable
anger <- verbal[, colnames(verbal) == "Anger"]
verbal <- verbal[, colnames(verbal) != "Anger"]
# Testing both DIF effects simultaneously
# Three equivalent settings of the data matrix and the group membership
r <- difLogistic(verbal, group = 25, focal.name = 1)
difLogistic(verbal, group = "Gender", focal.name = 1)
difLogistic(verbal[, 1:24], group = verbal[, 25], focal.name = 1)
# Returning all covariance matrices of model parameters
difLogistic(verbal, group = 25, focal.name = 1, all.cov = TRUE)
# Testing both DIF effects with the Wald test
r2 <- difLogistic(verbal, group = 25, focal.name = 1, criterion = "Wald")
# Testing nonuniform DIF effect
difLogistic(verbal, group = 25, focal.name = 1, type = "nudif")
# Testing uniform DIF effect
difLogistic(verbal, group = 25, focal.name = 1, type = "udif")
# Multiple comparisons adjustment using Benjamini-Hochberg method
difLogistic(verbal, group = 25, focal.name = 1, p.adjust.method = "BH")
# With item purification
difLogistic(verbal, group = "Gender", focal.name = 1, purify = TRUE)
difLogistic(verbal, group = "Gender", focal.name = 1, purify = TRUE, nrIter = 5)
# With combination of item purification and multiple comparisons adjustment
difLogistic(verbal, group = "Gender", focal.name = 1, purify = TRUE,
            p.adjust.method = "BH", puriadjType = "simple")
difLogistic(verbal, group = "Gender", focal.name = 1, purify = TRUE,
            p.adjust.method = "BH", puriadjType = "combined")
# With items 1 to 5 set as anchor items
difLogistic(verbal, group = 25, focal.name = 1, anchor = 1:5)
# Using anger trait score as the matching criterion
difLogistic(verbal, group = 25, focal.name = 1, match = anger)
# Using trait anger score as the group variable (i.e. testing
# for DIF with respect to trait anger score)
difLogistic(verbal[, 1:24], group = anger, member.type = "cont")
# Saving the output into the "Lresults.txt" file (and default path)
r <- difLogistic(verbal, group = 25, focal.name = 1, save.output = TRUE,
                 output = c("Lresults", "default"))
# Graphical devices
plot(r)
plot(r2)
plot(r, plot = "itemCurve", item = 1)
plot(r, plot = "itemCurve", item = 1, itemFit = "null")
plot(r, plot = "itemCurve", item = 6)
plot(r, plot = "itemCurve", item = 6, itemFit = "null")
# Plotting results and saving it in a PDF figure
plot(r, save.plot = TRUE, save.options = c("plot", "default", "pdf"))
# Changing the path, JPEG figure
path <- "c:/Program Files/"
plot(r, save.plot = TRUE, save.options = c("plot", path, "jpeg"))
## End(Not run)