Using `jrt`

  fig.height = 5,
    fig.width = 7.3,
  collapse = TRUE,
  comment = "#>"

This package provides user-friendly functions designed for the easy implementation of Item-Response Theory (IRT) models and scoring with judgment data. Although it can be used in a variety of contexts, the original motivation for implementation is to facilitate use for creativity researchers.


jrt is not an estimation package, it provides wrapper functions that call estimation packages and extract/report/plot information from them. At this stage, jrt uses the (excellent) package mirt (Chalmers, 2012) as its only IRT engine. Thus, if you use jrt for your research, please ensure to cite mirt as the estimation package/engine:

We also encourage that you cite jrt – especially if you use the plots or the automatic model selection. Currently, this would be done with:

Ok now let's get started...

What the data should look like

Then, a judgment data.frame would be provided to the function jrt. Here we'll use the simulated one in jrt::ratings.

data <- jrt::ratings

It looks like this:


jrt is in development and these features will hopefully appear soon (check back !), but in this release:

I know, that's a lot that you can't do...but this covers the typical cases, at least for the Consensual Assessment Technique -- which is why it was originally created.

Model fitting, scoring and statistics with jrt()

You will first want to first load the library.


The main function of the jrt package is jrt(). By default, this function will:

Let's do it!

fit <- jrt(data, = F)

Of course there's more available here than one would report. If using IRT scoring (which is the main purpose of this package), we recommend reporting what IRT model was selected, along with IRT indices primarily, since the scoring is based on the estimation of the $\theta$ abilities. In this case typically what is reported in the empirical reliability (here r round(fit@empirical.reliability, 3)), which is the estimate of the reliability of the observations in the sample. It can be interpreted similarily as other more traditionnal indices of reliability (like Cronbach's $\alpha$).

fit <- jrt(data, silent = T)

One may of course select a model based on assumptions on the data rather than on model fit comparisons. This is done through using the name of a model as an imput of the argument irt.model of the jrt() function. This bypasses the automatic model selection stage.

fit <- jrt(data, "PCM")

See the documentation for a list of available models. Most models are directly those of mirt. Others are versions of the Graded Response Model or Generalized Partial Credit Model that are constrained in various ways (equal discriminations and/or equal category structures) through the mirt.model() function of mirt.

Note that they can also be called by their full names (e.g. jrt(data, "Graded Response Model")).


Note : If you want a more complete output with the original data, use If there were missing data, also appends imputed data.


Judge Category Curves

Judge characteristics can be inspected with Judge Category Curve (JCC) plots. They are computed with the function jcc.plot().

A basic example for Judge 3...

jcc.plot(fit, judge = 3)

Now of course, there are many options, but a few things that you could try:

jcc.plot(fit, judge = c(1,6))
jcc.plot(fit, facet.cols = 2)
jcc.plot(fit, 1, greyscale = T)
jcc.plot(fit, 1, overlay.reliability = TRUE)
jcc.plot(fit, overlay.reliability = T, labelled = F)
jcc.plot(fit, overlay.reliability = T, labelled = F, legend.position = "bottom")
jcc.plot(fit, 2, column.names = "Expert")
jcc.plot(fit, 3:4,
         manual.facet.names = paste("Expert ", c("A", "B", "C", "D", "E", "F")),
         manual.line.names = c("Totally disagree", "Disagree", "Neither agree\nnor disagree", "Agree", "Totally agree"),
         labelled = F)
jcc.plot(fit, 1, title = "")
jcc.plot(fit, 1, theta.span = 5)
jcc.plot(fit, 1:4,
         labelled = F,
         line.opacity = c(0,0,0,1,0,0) # Highlighting the 4th category
jcc.plot(fit, 1, color.palette = "Dark2", theme = "classic", line.width = 1.5, = "serif", overlay.reliability = T, name.for.reliability = "Reliability")
jcc.plot(fit, 1:3, labelled = F, line.opacity = c(0,0,0,1,0,0))


jcc.plot(fit, 1, color.palette = "Blues", theme = "grey", line.width = 3, labelled = F)

I've also integrated the colors of the ggsci package (npg, aaas, nejm, lancet, jama, jco, D3, locuszoom, igv, uchicago, startrek, tron, futurama), but be careful, not all may have sufficient color values!

jcc.plot(fit, 1, color.palette = "npg", overlay.reliability = T)

Information Plots

The jrt() function already plots an information plot, but information plots can be called (as well as variants of information, like standard error and reliability), with the info.plot() function.

info.plot(fit, 1)

(type = "reliability" also works)

info.plot(fit, type = "r")
info.plot(fit, type = "se")

(type = "Standard Error" also works)

info.plot(fit, type = "r", y.limits = c(0,1))
info.plot(fit, type = "r", y.line = .70)
info.plot(fit, type = "ise")

With a threshold value

info.plot(fit, type = "ir", y.line = .7)

And here again, themes are available.

info.plot(fit, type = "ir", y.line = .7, color.palette = "Dark2")

Similar customizing options than jcc.plot() are available, here is an example:

info.plot(fit, 1, "ir",
          column.names = "Rater",
          theta.span = 5,
          theme = "classic",
          line.width = 2,
          greyscale = T,
 = "serif")

Dealing with unobserved categories and Rating Scale Models

Some polytomous IRT models (namely, the Rating Scale models) assume that judges all have the same response category structure, and so they cannot be estimated if all judges do not have the same observed categories. So, if your data includes judges with unobserved categories, how does jrt deal with that?

For the automatic model selection stage, jrt will by default keep all judges but, if there are judges with unobserved categories, it will not fit the Rating Scale and Generalized Rating Scale models. You will be notified in the output.

N <- 100
judges <- 8
diffs <- t(apply(matrix(runif(judges*4, .4, 5), judges), 1, cumsum))
d <- -(diffs - rowMeans(diffs)) + stats::rnorm(judges, mean = 0, sd= 1)
data <- mirt::simdata(matrix(rlnorm(judges,1,0)), d, N,
itemtype = 'graded') + 1
colnames(data) <- paste("Judge_", 1:dim(data)[2], sep = "")

Note : The possible values are automatically detected, but it can be bypassed with the possible.values argument.

Here's an example on a data set where a judge had unobserved categories. By default the set of candidate models will exclude rating scale models (note in the plot that the last judge has an uboserved category).

fit <- jrt(data, 
  = F, #removing the progress bar for the example
           plots = F) 

Now, if you want instead to remove the incomplete judges to compare the models, set remove.judges.with.unobserved.categories = TRUE (it's a long name for an argument, so if you have a better idea of a clear but shorter name shoot me an email!). Now all models will be compared, but with only the complete judges.

After this stage:

An example with the same data as above but with remove.judges.with.unobserved.categories = TRUE. Here, since the best fitting model was the Constrained Graded Response Model (not a Rating Scale Model), then the model is fit again with all judges (hence the different AIC between the two stages).

fit <- jrt(data, 
           remove.judges.with.unobserved.categories = T,
  = F, #removing the progress bar for the example
           plots = F) 

Getting additional statistics

Additionnal statistics may be computed with additional.stats = TRUE.

fit <- jrt(data,
           additional.stats = T,
  = F,
           plots = F) #removing the progress bar for the example

Using the fitted object

The fitted model is stored in the slot @mirt.object, so additionnal functions from mirt can be easily used.

For example:

# Get more fit indices and compare models
mirt::anova(fit@mirt.object, verbose = F)
# Get total information for a given vector of attributes
mirt::testinfo(fit@mirt.object, Theta = seq(from = -3, to = 3, by = 1))
# Get the test information for case 1
mirt::testinfo(fit@mirt.object, Theta = fit@factor.scores.vector[1])
# Get marginal reliability for high abilities – using a Normal(1,1) prior
                   density = function(x) {dnorm(x, mean = 1, sd = 1)})

Comparing two models with Likelihood Ratio Tests

For now, direct comparisons between two models are not directly implemented, but rather easy to do with mirt's anova() function, applied on the @mirt.object from two fitted models.

model1 <- jrt(data, "GRM", silent = T) # Fitting a GRM
model2 <- jrt(data, "CGRM", silent = T) # Fitting a Constrained GRM
mirt::anova(model1@mirt.object, model2@mirt.object, verbose = F) #Comparing them

Dealing with missing data

The ratings_missing data is a simulated dataset with a planned missingness design. jrt will be default impute missing data for partially missing data, but can be easily retrieved.

fit <- jrt(ratings_missing, irt.model = "PCM", silent = T) #fit model

The contains both the original data and the data with imputation (variable names are tagged "original"" and "imputed"), as well as the factor scores.

To retrieved them separately, the imputed data can be retrieved with, the original data is in and the factor scores can be retrieved like described previously.

Just using jrt for plotting?

You may want to use jrt as a plotting device only. That's ok, because jrt plotting functions will accept mirt objects as input. They should be detected automatically as such (unidimensional models only).

Let's fit a Generalized Partial Credit Model with mirt for this example.

fit <- mirt::mirt(data = mirt::Science, 
                  model = 1, 
                  itemtype = "gpcm",
                  verbose = F)

Now jcc.plot() can plot the category curves. Note that the default column names is now automatically switched to "Item".


For the information plot:


For convenience the argument item can be used instead of judge in both plotting functions:

jcc.plot(fit, item = 3)

Even though it isn't its primary purpose, jrt can also plot binary item response functions. They will be automatically detected and the plot will be named accordingly.

# SAT data from mirt
## Convert to binary
data <- mirt::key2binary(mirt::SAT12,
    key = c(1,4,5,2,3,1,2,1,3,1,2,4,2,1,5,3,4,4,1,4,3,3,4,1,3,5,1,3,1,5,4,5))
## Fit 2PL model in mirt
fit <- mirt::mirt(data = data, model = 1, itemtype = "2PL", verbose = F)
## Plotting an item response function
jcc.plot(fit, item = 2)
## Plotting the item response functions of the first 12 items with a larger theta range
jcc.plot(fit, facet.cols = 4, item = 1:12, theta.span = 5)

Try the jrt package in your browser

Any scripts or data that you put into this service are public.

jrt documentation built on April 14, 2023, 12:35 a.m.