knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) options(tibble.print_min = 10, tibble.print_max = 20, pillar.min_title_chars = 16)
ItemResponseTrees is an R package that allows to fit IR-tree models in mirt, Mplus, or TAM. If you're unfamiliar with IR-trees, the papers of Böckenholt (2012) and Plieninger (2020) are good starting points. If you're familiar with the class of IR-tree models, this vignette will get you started on fitting your models with ItemResponseTrees.
The package automates some of the hassle of IR-tree modeling by means of a consistent syntax. This allows new users to quickly adopt this model class, and this allows experienced users to fit many complex models effortlessly.
Herein, an illustrative example will be shown using a popular IR-tree model for 5-point items (Böckenholt, 2012). The model is applied to a Big Five data set from Jackson (2012), more precisely to nine conscientiousness items.
library("ItemResponseTrees") data("jackson") set.seed(9701) df1 <- jackson[sample(nrow(jackson), 321), paste0("E", 1:9)] df1
The model is defined by the following tree diagram. The model equations can be directly derived from the diagram: The probability for a specific category is given by multiplying all parameters along the branch that lead to that category (see also Böckenholt, 2012; Plieninger, 2020). For example, the branch leading to Category 5 is comprised of the parameters (1-m), t, and e. The resulting five equations are shown in the right part of the figure.
In the ItemResponseTrees package, a model is defined using a specific syntax that consists mainly of three parts.
Equations:Herein, the equation for each response category is listed in the format
cat = p1 * (1-p2), where
catis one of the observed responses (e.g., 1, ..., 5). Furthermore,
p1is a freely chosen parameter label, and I've chosen
mbelow corresponding to the diagram above.
IRT:The parameters in the
Equations(and also those in the figure above) actually correspond to latent variables of an IRT model. These latent variables are measured using a number of items/variables, and this is specified in this section using the same parameter labels as in
@. The syntax below fixes all loadings corresponding to dimensions e and m to 1 corresponding to a 1PL or Rasch model, whereas all loadings corresponding to dimension t are freely estimated (i.e., 2PL-structure).
Class:Can be either
Treefor an IR-tree model or
GRMfor a graded response model.
In the following code chunk, the model string for the IR-tree model depicted above is specified and saved as
The helper function
irtree_create_template() can assist you in creating such a model string (especially if you know the "pseudoitems").
m1 <- " # IR-tree model for 5-point items (Böckenholt, 2012) Equations: 1 = (1-m)*(1-t)*e 2 = (1-m)*(1-t)*(1-e) 3 = m 4 = (1-m)*t*(1-e) 5 = (1-m)*t*e IRT: t BY E1, E2, E3, E4, E5, E6, E7, E8, E9; e BY E1@1, E2@1, E3@1, E4@1, E5@1, E6@1, E7@1, E8@1, E9@1; m BY E1@1, E2@1, E3@1, E4@1, E5@1, E6@1, E7@1, E8@1, E9@1; Class: Tree "
In case of a graded response model, only two sections of the model string need to be specified.
m2 <- " # Graded response model IRT: t BY E1, E2, E3, E4, E5, E6, E7, E8, E9; Class: GRM "
Subsequently, the function
irtree_model() needs to be called, which takes a model string such as
m2 as its sole argument.
The resulting objects
model2 of class
irtree_model contain all the necessary information for fitting the model.
Furthermore, one may inspect specific elements, for example, the pseudoitems contained in the mapping matrix.
Further information on creating model strings is provided in
model1 <- irtree_model(m1) model2 <- irtree_model(m2) model1$mapping_matrix
Then, the model can be
fit() using one of three different engines.
The ItemResponseTrees package supports the engines mirt, TAM, and Mplus (via the MplusAutomation package).
Additional arguments for the engine, for example, details of the algorithms, can be specified via the
# mirt can be used with an EM algorithm (the default) or, for example, with the # MH-RM algorithm, which seems a little bit faster here. # See ?mirt::mirt for details. ctrl <- control_mirt(method = "MHRM") fit1 <- fit(model1, data = df1, engine = "mirt", control = ctrl) fit2 <- fit(model2, data = df1, engine = "mirt", control = ctrl)
The easiest way to access the information stored in
fit2 is via the functions
augment() (that come from the broom package, which is part of the tidyverse).
Information about model fit is obtained via
As seen below, the IR-tree model has 41 freely estimated parameters (3 x 9 thresholds + 9 loadings + 2 variances + 3 covariances).
The GRM has 45 estimated parameters (4 x 9 thresholds + 9 loadings).
(Of course, this comparison is a little bit unfair, because the IR-tree model is much more flexible in terms of dimensionality/"random effects" even though it is less flexible with respect to the thresholds/"fixed effects".)
For the present data, the IR-tree model slightly outperforms the GRM according to AIC and BIC, and thus one may conclude that response styles are present in these data.
glance(fit1) rbind(glance(fit1), glance(fit2))
The parameter estimates are obtained via
For the IR-tree model, this returns a tibble with 66 rows (pertaining to the fixed and estimated parameters).
Below, the nine threshold/difficulty parameters
t_E*.d pertaining to parameter t are shown plus the threshold of pseudoitem
The latent variances, covariances, and correlations are shown below as well, and these show the typical pattern of a negative correlation between e and m.^[The order of the processes corresponds to the order of appearance in the section
IRT of the model string.
Thus, the order here is t, e, m, such that
COV_33 is the variance of person parameters for m, and
CORR_32 is the correlation between m and e.]
tidy(fit1, par_type = "difficulty") tail(tidy(fit1, par_type = "difficulty"), 9)
The factor scores or person parameter estimates are obtained via
This returns a tibble comprised of the data set, the factor scores (e.g.,
.fitted.t), and respective standard errors (e.g.,
The correlation of the scores for the target trait (extraversion in this case) between the IR-tree model and the GRM indicates that the models differ in this respect even though not drastically.
augment(fit1) cor(augment(fit1)$.fitted.t, augment(fit2)$.fitted.t)
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.