Performs DIF detection using the Mantel-Haenszel method.
difMH(Data, group, focal.name, anchor = NULL, match = "score", MHstat = "MHChisq",
      correct = TRUE, exact = FALSE, alpha = 0.05, purify = FALSE, nrIter = 10,
      p.adjust.method = NULL, save.output = FALSE, output = c("out", "default"))
## S3 method for class 'MH'
print(x, ...)
## S3 method for class 'MH'
plot(x, pch = 8, number = TRUE, col = "red", save.plot = FALSE,
save.options = c("plot", "default", "pdf"), ...)

Data 
numeric: either the data matrix only, or the data matrix plus the vector of group membership. See Details. 
group 
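numeric or character: either the vector of group membership or the column indicator (within Data) of group membership. See Details.
focal.name 
numeric or character indicating the level of group which corresponds to the focal group.
anchor 
either NULL (default) or a vector of item names (or column numbers) specifying the anchor items. See Details.
match 
specifies the type of matching criterion. Can be either "score" (default) to compute the test score, or a vector of continuous or discrete numeric values to use as the matching criterion. See Details.
MHstat 
character: specifies the DIF statistic to be used for DIF identification. Possible values are "MHChisq" (default) and "logOR". See Details.
correct 
logical: should the continuity correction be used? (default is TRUE).
exact 
logical: should an exact test be computed? (default is FALSE).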
alpha 
numeric: significance level (default is 0.05). 
purify 
logical: should the method be used iteratively to purify the set of anchor items? (default is FALSE). 
nrIter 
numeric: the maximal number of iterations in the item purification process (default is 10). 
p.adjust.method 
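either NULL (default) or the acronym of the method for p-value adjustment for multiple comparisons. See Details.
save.output 
logical: should the output be saved into a text file? (default is FALSE).
output 
character: a vector of two components. The first component is the name of the output file, the second component is either the file path or "default" (default value). See Details.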

x 
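the result from a call to difMH (an object of class "MH").
pch, col 
type of usual pch and col graphical options.
number 
logical: should the item number identification be printed (default is TRUE).
save.plot 
logical: should the plot be saved into a separate file? (default is FALSE).
save.options 
character: a vector of three components. The first component is the name of the output file, the second component is either the file path or "default" (default value), and the third component is the figure format, either "pdf" (default) or "jpeg". See Details.

... 
other generic parameters for the plot or the print functions.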
The Mantel-Haenszel method (Mantel and Haenszel, 1959) detects uniform differential item functioning without requiring an item response model approach.
The Data is a matrix whose rows correspond to the subjects and columns to the items. In addition, Data can hold the vector of group membership. If so, group indicates the column of Data which corresponds to the group membership, either by specifying its name or by giving the column number. Otherwise, group must be a vector of the same length as nrow(Data).
Missing values are allowed for item responses (not for group membership) but must be coded as NA values. They are discarded from the sum-score computation.
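As a rough illustration of what discarding NA values from the sum-score means, the base R sketch below computes raw scores on a small made-up response matrix. The object names (items, score) and the data are assumptions for illustration only, and the internal computation in difMH may differ in detail.

```r
# Made-up responses: 3 subjects (rows) by 4 items (columns), with two NA values
items <- matrix(c(1, 0, NA, 1,
                  0, 1, 1, NA,
                  1, 1, 1, 0),
                nrow = 3, byrow = TRUE)

# Raw score per subject; missing responses are simply skipped
score <- rowSums(items, na.rm = TRUE)
score
```

A subject with a missing response is therefore scored on the items actually answered, rather than being dropped from the analysis.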
The vector of group membership must hold only two different values, either as numeric or character. The focal group is defined by the value of the argument focal.name.
The matching criterion can be either the test score or any other continuous or discrete variable to be passed to the mantelHaenszel function. This is specified by the match argument. By default, it takes the value "score" and the test score (i.e. raw score) is computed. The second option is to assign to match a vector of continuous or discrete numeric values, which acts as the matching criterion. Note that for consistency this vector should not belong to the Data matrix.
The DIF statistic is specified by the MHstat argument. By default, MHstat takes the value "MHChisq" and the Mantel-Haenszel chi-square statistic is used. The other optional value is "logOR", and the log odds-ratio statistic (that is, the log of alphaMH divided by the square root of varLambda) is used. See Penfield and Camilli (2007), Phillips and Holland (1987) and the mantelHaenszel help file.
By default, the asymptotic Mantel-Haenszel statistic is computed. However, the exact statistics and related P-values can be obtained by setting the logical argument exact to TRUE. See Agresti (1990, 1992) for further details about exact inference.
The threshold (or cut-score) for classifying items as DIF depends on the DIF statistic. With the Mantel-Haenszel chi-square statistic (MHstat=="MHChisq"), it is computed as the quantile of the chi-square distribution with lower-tail probability of one minus alpha and with one degree of freedom. With the log odds-ratio statistic (MHstat=="logOR"), it is computed as the quantile of the standard normal distribution with lower-tail probability of 1-alpha/2. With exact inference, it is simply the alpha level since exact P-values are returned.
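The two asymptotic thresholds described above can be reproduced with base R quantile functions. This is only a sketch of the stated rule: the values of alphaMH and varLambda below are made-up numbers for a single item, and the variable names are chosen for illustration.

```r
alpha <- 0.05

# Chi-square statistic: quantile with lower-tail probability 1 - alpha, 1 df
thr_chisq <- qchisq(1 - alpha, df = 1)

# Log odds-ratio statistic: standard normal quantile at 1 - alpha/2
thr_logOR <- qnorm(1 - alpha / 2)

# Made-up estimates for one item (not taken from any data set)
alphaMH   <- 1.8
varLambda <- 0.04
z <- log(alphaMH) / sqrt(varLambda)   # log odds-ratio statistic
flagged <- abs(z) > thr_logOR         # two-sided decision at level alpha
```

Because the square of a standard normal variable follows a chi-square distribution with one degree of freedom, thr_logOR^2 equals thr_chisq, so both thresholds correspond to the same significance level.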
By default, the continuity correction factor 0.5 is used (Holland and Thayer, 1988). One can nevertheless remove it by specifying correct=FALSE.
In addition, the Mantel-Haenszel estimates of the common odds ratios α_{MH} are used to measure the effect sizes of the items. These are obtained by Δ_{MH} = -2.35 \log α_{MH} (Holland and Thayer, 1985). According to the ETS delta scale, the effect size of an item is classified as negligible if |Δ_{MH}| ≤ 1, moderate if 1 ≤ |Δ_{MH}| ≤ 1.5, and large if |Δ_{MH}| ≥ 1.5. The values of the effect sizes, together with the ETS classification, are printed with the output. Note that this is returned only for asymptotic tests, i.e. when exact is FALSE.
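The effect-size rule can be sketched in a few lines of base R. The helper name ets_class and the input odds ratios are hypothetical, and the sketch assumes the usual ETS convention that the delta value is -2.35 times the log odds ratio and that the classification uses its absolute value. Boundary values (exactly 1 or 1.5) are assigned to the lower category here, which is an arbitrary choice.

```r
# Hypothetical helper: ETS delta-scale classification from common odds ratios
ets_class <- function(alphaMH) {
  deltaMH <- -2.35 * log(alphaMH)
  size <- cut(abs(deltaMH),
              breaks = c(0, 1, 1.5, Inf),
              labels = c("negligible", "moderate", "large"),
              include.lowest = TRUE)
  data.frame(alphaMH = alphaMH, deltaMH = deltaMH, size = size)
}

# Made-up odds ratios for three items
ets <- ets_class(c(1, 0.6, 0.4))
ets
```

An odds ratio of 1 gives a delta of 0 (no effect), while odds ratios further from 1 in either direction give larger absolute delta values.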
Item purification can be performed by setting purify to TRUE. Purification works as follows: if at least one item was detected as functioning differently at some step of the process, then the data set of the next step consists of all items that are currently anchor (DIF-free) items, plus the tested item (if necessary). The process stops when either two successive applications of the method yield the same classifications of the items (Clauser and Mazor, 1998), or when nrIter iterations are run without obtaining two successive identical classifications. In the latter case a warning message is printed.
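The stopping rule of the purification process can be sketched as a generic loop. Everything in this block is hypothetical: purify_items is not a difR function, and detect stands for any routine that takes the current anchor item indices and returns the indices of the items it flags (adding the tested item to the matching set is assumed to happen inside detect). The toy detector only serves to make the sketch runnable.

```r
# Hypothetical sketch of iterative purification (not the difR implementation)
purify_items <- function(detect, n_items, nrIter = 10) {
  all_items <- seq_len(n_items)
  flagged <- sort(detect(all_items))       # initial step: all items are anchors
  nrPur <- 0
  converged <- length(flagged) == 0        # nothing flagged: nothing to purify
  while (!converged && nrPur < nrIter) {
    nrPur <- nrPur + 1
    anchor <- setdiff(all_items, flagged)  # keep only currently DIF-free items
    new_flagged <- sort(detect(anchor))
    converged <- identical(new_flagged, flagged)  # same classification twice
    flagged <- new_flagged
  }
  if (!converged) warning("no convergence after ", nrIter, " iterations")
  list(DIFitems = flagged, nrPur = nrPur, convergence = converged)
}

# Toy detector: item 3 is always flagged; item 5 only once item 3 has been
# removed from the anchor set
toy_detect <- function(anchor) if (3L %in% anchor) 3L else c(3L, 5L)

res <- purify_items(toy_detect, n_items = 6)
```

With this toy detector the initial step flags item 3, the first iteration flags items 3 and 5, and the second iteration repeats that classification, so the loop stops after two iterations.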
Adjustment for multiple comparisons is possible with the argument p.adjust.method. The latter must be an acronym of one of the available adjustment methods of the p.adjust function. According to Kim and Oshima (2013), Holm and Benjamini-Hochberg adjustments (set respectively by "Holm" and "BH") perform best for DIF purposes. See the p.adjust function for further details. Note that item purification is performed on the original statistics and p-values; in case of adjustment for multiple comparisons, the adjustment is performed after item purification.
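To show what the adjustment does to a set of item p-values, the sketch below calls the base R p.adjust function directly on made-up values. Note that p.adjust itself spells the Holm method in lowercase ("holm"). The p-values and object names are assumptions for illustration.

```r
# Made-up p-values for five items
p <- c(0.001, 0.012, 0.02, 0.2, 0.5)

p_holm <- p.adjust(p, method = "holm")  # Holm adjustment
p_bh   <- p.adjust(p, method = "BH")    # Benjamini-Hochberg adjustment

# Items flagged at the 0.05 level after each adjustment
dif_holm <- which(p_holm < 0.05)
dif_bh   <- which(p_bh < 0.05)
```

In this toy case the unadjusted p-values would flag items 1 to 3, the Holm adjustment keeps only items 1 and 2, and the Benjamini-Hochberg adjustment keeps items 1 to 3.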
A pre-specified set of anchor items can be provided through the anchor argument. It must be a vector of either item names (which must match exactly the column names of the Data argument) or integer values (specifying the column numbers for item identification). When anchor items are provided, they are used to compute the test score (matching criterion), together with the tested item. None of the anchor items are tested for DIF: the output separates anchor items and tested items, and DIF results are returned only for the latter. Note also that item purification is not activated when anchor items are provided (even if purify is set to TRUE). By default anchor is NULL, so that no anchor item is specified.
The output of difMH, as displayed by the print.MH function, can be stored in a text file provided that save.output is set to TRUE (the default value FALSE does not execute the storage). In this case, the name of the text file must be given as a character string in the first component of the output argument (default name is "out"), and the path for saving the text file can be given through the second component of output. The default value is "default", meaning that the file will be saved in the current working directory. Any other path can be specified as a character string: see the Examples section for an illustration.
The plot.MH function displays the DIF statistics in a plot, with each item on the X axis. The type of point and the color are fixed by the usual pch and col arguments. The number option displays the item numbers instead of the points. The plot can also be stored in a figure file, in either PDF or JPEG format, by setting save.plot to TRUE. The figure is defined through the components of save.options. The first two components work in the same way as those of the output argument. The third component is the figure format, with allowed values "pdf" (default) for a PDF file and "jpeg" for a JPEG file. Note that no plot is returned for exact inference.
A list of class "MH" with the following components:
MH 
the values of the Mantel-Haenszel DIF statistics (either exact or asymptotic).
p.value 
the p-values for the Mantel-Haenszel statistics (either exact or asymptotic).
alphaMH 
the values of the Mantel-Haenszel estimates of common odds ratios. Returned only if exact is FALSE.
varLambda 
the values of the variances of the log odds-ratio statistics. Returned only if exact is FALSE.
MHstat 
the value of the MHstat argument.
alpha 
the value of the alpha argument.
thr 
the threshold (cut-score) for DIF detection. Returned only if exact is FALSE.
DIFitems 
either the column indicators of the items which were detected as DIF items, or "No DIF item detected". 
correct 
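the value of the correct argument.
exact 
the value of the exact argument.
match 
a character string, either "score" or a label indicating that an external matching variable was supplied, depending on the match argument.
p.adjust.method 
the value of the p.adjust.method argument.
adjusted.p 
either NULL or the vector of adjusted p-values for multiple comparisons.
purification 
the value of the purify argument.
nrPur 
the number of iterations in the item purification process. Returned only if purify is TRUE.
difPur 
a binary matrix with one row per iteration in the item purification process and one column per item. Zeros and ones in the i-th row refer to items which were classified respectively as non-DIF and DIF items at the (i-1)-th step. The first row corresponds to the initial classification of the items. Returned only if purify is TRUE.
convergence 
logical indicating whether the iterative item purification process stopped before the maximal number nrIter of allowed iterations. Returned only if purify is TRUE.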
names 
the names of the items. 
anchor.names 
the value of the anchor argument.
save.output 
the value of the save.output argument.
output 
the value of the output argument.
Sebastien Beland
Collectif pour le Developpement et les Applications en Mesure et Evaluation (Cdame)
Universite du Quebec a Montreal
sebastien.beland.1@hotmail.com, http://www.cdame.uqam.ca/
David Magis
Department of Psychology, University of Liege
Research Group of Quantitative Psychology and Individual Differences, KU Leuven
David.Magis@uliege.be, http://ppw.kuleuven.be/okp/home/
Gilles Raiche
Collectif pour le Developpement et les Applications en Mesure et Evaluation (Cdame)
Universite du Quebec a Montreal
raiche.gilles@uqam.ca, http://www.cdame.uqam.ca/
Agresti, A. (1990). Categorical data analysis. New York: Wiley.
Agresti, A. (1992). A survey of exact inference for contingency tables. Statistical Science, 7, 131-177. doi: 10.1214/ss/1177011454
Holland, P. W. and Thayer, D. T. (1985). An alternative definition of the ETS delta scale of item difficulty. Research Report RR-85-43. Princeton, NJ: Educational Testing Service.
Holland, P. W. and Thayer, D. T. (1988). Differential item performance and the Mantel-Haenszel procedure. In H. Wainer and H. I. Braun (Eds.), Test validity. Hillsdale, NJ: Lawrence Erlbaum Associates.
Kim, J., and Oshima, T. C. (2013). Effect of multiple testing adjustment in differential item functioning detection. Educational and Psychological Measurement, 73, 458-470. doi: 10.1177/0013164412467033
Magis, D., Beland, S., Tuerlinckx, F. and De Boeck, P. (2010). A general framework and an R package for the detection of dichotomous differential item functioning. Behavior Research Methods, 42, 847-862. doi: 10.3758/BRM.42.3.847
Mantel, N. and Haenszel, W. (1959). Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute, 22, 719-748.
Penfield, R. D., and Camilli, G. (2007). Differential item functioning and item bias. In C. R. Rao and S. Sinharay (Eds.), Handbook of Statistics 26: Psychometrics (pp. 125-167). Amsterdam, The Netherlands: Elsevier.
Phillips, A., and Holland, P. W. (1987). Estimators of the variance of the Mantel-Haenszel log-odds-ratio estimate. Biometrics, 43, 425-431. doi: 10.2307/2531824
Raju, N. S., Bode, R. K. and Larsen, V. S. (1989). An empirical assessment of the Mantel-Haenszel statistic to detect differential item functioning. Applied Measurement in Education, 2, 1-13. doi: 10.1207/s15324818ame0201_1
Uttaro, T. and Millsap, R. E. (1994). Factors influencing the Mantel-Haenszel procedure in the detection of differential item functioning. Applied Psychological Measurement, 18, 15-25. doi: 10.1177/014662169401800102
mantelHaenszel, dichoDif, p.adjust
## Not run:

# Loading the difR package and the verbal data
library(difR)
data(verbal)

# Excluding the "Anger" variable
verbal <- verbal[colnames(verbal) != "Anger"]

# Three equivalent settings of the data matrix and the group membership
r <- difMH(verbal, group = 25, focal.name = 1)
difMH(verbal, group = "Gender", focal.name = 1)
difMH(verbal[, 1:24], group = verbal[, 25], focal.name = 1)

# With log odds-ratio statistic
r2 <- difMH(verbal, group = 25, focal.name = 1, MHstat = "logOR")

# With exact inference
difMH(verbal, group = 25, focal.name = 1, exact = TRUE)

# Multiple comparisons adjustment using Benjamini-Hochberg method
difMH(verbal, group = 25, focal.name = 1, p.adjust.method = "BH")

# With item purification
difMH(verbal, group = "Gender", focal.name = 1, purify = TRUE)
difMH(verbal, group = "Gender", focal.name = 1, purify = TRUE, nrIter = 5)

# Without continuity correction and with 0.01 significance level
difMH(verbal, group = "Gender", focal.name = 1, alpha = 0.01, correct = FALSE)

# With items 1 to 5 set as anchor items
# (purification is not activated when anchor items are provided)
difMH(verbal, group = "Gender", focal.name = 1, anchor = 1:5)
difMH(verbal, group = "Gender", focal.name = 1, anchor = 1:5, purify = TRUE)

# Saving the output into the "MHresults.txt" file (and default path)
r <- difMH(verbal, group = 25, focal.name = 1, save.output = TRUE,
           output = c("MHresults", "default"))

# Graphical devices
plot(r)
plot(r2)

# Plotting results and saving it in a PDF figure
plot(r, save.plot = TRUE, save.options = c("plot", "default", "pdf"))

# Changing the path, JPEG figure
path <- "c:/Program Files/"
plot(r, save.plot = TRUE, save.options = c("plot", path, "jpeg"))

## End(Not run)
