knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(birtms)
library(magrittr)

Abstract

This R package birtms combines the approach to flexible item response theory (IRT) modeling as seen in flirt with the interface provided by brms to follow a Bayesian modeling approach. In this vignette you will learn how to use the birtms package to specify a variety of IRT models. A following vignette will introduce the possibilities to evaluate your IRT models with birtms.

Introduction

Since this package builds on brms as an interface to specify and "fit Bayesian generalized (non-)linear multivariate multilevel models using 'Stan'^[The engine doing the hard work of Bayesian modeling is Stan and thus the default sampler is the no-U-turn sampler, an extension of the Hamiltonian Monte Carlo algorithm. For an overview of different samplers and some very enlightening animations showing the benefits of the no-U-turn sampler have a look at McElreath's blog entry.]" [@R-brms], it is limited to the models you can specify with brms and Stan itself. Therefore, e.g., mixed Rasch models can't be specified right now, since Stan can't handle discrete parameters without some tricks.^[Have a look at McElreath's blog entry about how to use discrete parameters in Stan. Unfortunately this strategy can't be used with brms, so you would have to learn about specifying your model directly, e.g. with rstan, all by yourself.] Nevertheless all models that could be fitted in flirt can be fitted via brms as well, and in the long term they will be available in birtms. Note that currently only dichotomous response variables can be modeled. In addition to the models available in flirt you can also fit three- and four-parametric models with birtms right now. In the far future there might even be the possibility to specify non-compensatory multidimensional IRT models. I hope to make a contribution to a wider usage of Bayesian IRT modeling among e.g. educational researchers, as the package edstan [@edstan] did with its preset models for rstan and exhaustive tutorials for modification and evaluation of these models, but in a more flexible and intuitive manner.

Be aware that, while a Bayesian model will always give you an answer to your question, it will take a lot longer than frequentist methods (like ML estimators), and the answer might be parameter estimates with huge credibility intervals. So models with many parameters (e.g. a three-parametric multidimensional testlet model for 100 persons and 270 items in 36 testlets, which results in 27000 observations and 4510 parameters) will have a long run time (up to two days). And due to the small sample size of 100 persons the resulting slope and pseudo guess parameters will have huge credibility intervals - this means you definitely should not proceed with your analysis just with the median or modal value for these parameters (as you would do in non-Bayesian statistics).^[In this case a look at the leave-one-out cross-validation criterion should show you a bunch of high Pareto k values, giving you a hint that your model probably is overfitting the data.] Also be aware that your question could be poorly specified, which will result in a useless answer regardless of the statistical foundation of your method.

Preparation for fitting IRT models with birtms

The following structure closely follows the article on the flirt package by @Jeon.2016 (which can be found here).

Installation

Data set-up

response_data <- tibble::tribble(
  ~ID, ~i01, ~i02, ~i03, ~gender,
  'ma01', 1, 0, 1, 'm',
  'ho02', 1, 1, 0, 'm',
  'mg04', 1, 1, 0, 'w',
  'na11', 0, 0, 1, 'm',
  'ma02', 0, 1, 1, 'w',
)

person_data <- tibble::tribble(
  ~ID, ~IQ,
  'ma01', 100, 
  'ho02', 115, 
  'mg04', 130, 
  'na11', 85, 
)

item_data <- tibble::tribble(
  ~item, ~pictures, ~color,
  'i01', 1, 'blue',
  'i02', 0, 'red',
  'i03', 2, 'green',
)

situation_data <- tibble::tribble(
     ~ID, ~item, ~color,
  "ma01", "i01",        'blue',
  "ma01", "i02",        'red',
  "ma01", "i03",        'green',
  "ho02", "i01",        'blue',
  "ho02", "i02",        'green',
  "ho02", "i03",        'red',
  "mg04", "i01",        'red',
  "mg04", "i02",        'blue',
  "mg04", "i03",        'green',
  "na11", "i01",        'green',
  "na11", "i02",        'red',
  "na11", "i03",        'blue',
  "ma02", "i01",        'red',
  "ma02", "i02",        'green',
  "ma02", "i03",        'blue'
  )

The data has to be in long format. To get the data from wide into long format and to add covariables (related to persons or items) you can use the birtms::compose_dataset function. (If it already is in long format you can skip this part and continue with the chapter about Model specification.) The following data is in wide format and contains the response data for five persons, identified by their ID, for three fictive items i01, i02, i03 and a person-specific covariable gender:

response_data

To convert the response data into long format call

var_specs1 = list(person = 'ID', response = 'response')
response_data_long1 <- birtms::compose_dataset(response_data = response_data,
                                              response_columns = i01:i03,
                                              variable_specifications = var_specs1)
head(response_data_long1)

Behind the scenes

So far so good. Let's have a look behind the scenes. This basic usage of birtms::compose_dataset does similar things as a call to

tidyr::pivot_longer(data = response_data,
                    cols = i01:i03,
                    names_to = 'item',
                    values_to = var_specs1$response) %>% head()

But have you noticed the difference? With our call to birtms::compose_dataset the variable gender gets dropped. Is this a bug or a feature? From my point of view it is a feature that helps you check your model specification via the variable_specifications argument you will learn about in detail later on. So birtms::compose_dataset drops any variable that isn't mentioned in the list passed to this argument.

Include covariables

So to keep the covariable gender we have to use it in our model specification. Let's say we want to use it as a predictor term. How can we specify this? Just like this:

var_specs2 = list(person = 'ID', response = 'response',
                 person_covariables_main_effect = 'gender')
response_data_long2 <- birtms::compose_dataset(response_data = response_data,
                                              response_columns = i01:i03,
                                              variable_specifications = var_specs2)
head(response_data_long2)

Pitfalls (Behind the scenes, 2nd part)

Well, person_covariables_main_effect is a really long term. Can we just use some abbreviation or dummy term?

var_specs3 = list(person = 'ID', response = 'response',
                 person_covariables = 'gender')
response_data_long3 <- birtms::compose_dataset(response_data = response_data,
                                              response_columns = i01:i03,
                                              variable_specifications = var_specs3)

No, we can't. The function checks our list against a preset of allowed names so you don't suffer from typos.^[These checks are performed by any of these functions: birtms::compose_dataset, birtms::build_formula, birtms::birtm and birtms::birt_aio.] You may also have noted that there are additional terms beginning with person_covariables_. You will learn about their different meanings later on.

Okay. But what if we want to rename the column of person identifiers to just person? Can we just run this?

var_specs4 = list(person = 'person', response = 'response',
                 person_covariables_main_effect = 'gender')
response_data_long4 <- birtms::compose_dataset(response_data = response_data,
                                              response_columns = i01:i03,
                                              variable_specifications = var_specs4)

Nope. Is there another possibility to convert the names on the fly? No. Come on! Just use dplyr::rename before handing over your dataset. It's not too hard, is it?
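A minimal sketch of that rename route, using a toy wide-format dataset like the one above:

```r
library(tibble)
library(dplyr)

# toy wide-format response data with an 'ID' column
resp <- tribble(
  ~ID,    ~i01, ~i02,
  'ma01', 1,    0,
  'ho02', 1,    1
)

# rename the identifier column before handing the data to birtms
resp <- resp %>% rename(person = ID)
names(resp)  # "person" "i01" "i02"
```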

Alter the 'item' column name

Attention: At the moment only the person column name can be handled flexibly. All evaluation functions in this version of birtms still require the default item column name.

What do you get from this example? The function checks for the existence of all columns in your input dataset except for the names response and item. So what about these names? You can choose them as you wish. (I know it's kind of inconsistent.) But wait! We never told birtms::compose_dataset about the name item. Where did it come from? How can we alter it? We can alter it as follows:

var_specs5 = list(person = 'ID', response = 'answer', item = 'task')
response_data_long5 <- birtms::compose_dataset(response_data = response_data,
                                              response_columns = i01:i03,
                                              variable_specifications = var_specs5)
head(response_data_long5)

Notice that the column names now are ID, answer, task instead of ID, response, item. But where did the term item come from in the first place?

The lazy way

There are some predefinitions for the variable_specifications list that get replaced by your arguments. The prespecified list is: list(response = 'response', item = 'item', person = 'person'). This implies that you can skip the arguments for response and item. For person this is only possible if you have a column named person in your dataset that holds unique person identifiers.^[Otherwise your model will do something really odd. Imagine person holds (for some reason) their school name. Then your schools will be treated as subjects that solved all items multiple times. I think this model would crash because the generated observations are not unique, but even if it won't (let me know if you tried this out): how on earth should this model tell us anything of interest about the persons who took the test?!]

response_data2 <- response_data %>% dplyr::rename(person = ID)
response_data_long6 <- birtms::compose_dataset(response_data = response_data2,
                                              response_columns = i01:i03)
head(response_data_long6)

For you to solve: How can we keep the gender column in a lazy way?

Include covariables extended

person covariables

All right. By now you should be able to shift your response data from wide format into long format and include person-specific covariables that were included in the response dataset. But wait! What about item-specific covariables? Or covariables that are specific for a particular person-item interaction (which I call situation covariables)? Let's have a look at the three arguments of birtms::compose_dataset we haven't touched yet: person_data = NULL, item_data = NULL, situation_data = NULL.

First things first. What might this person_data argument be for? Well, if you have person-specific covariables that aren't part of your response dataset yet, you can pass them here and the function will use dplyr::left_join to stick the two together by the person identifier column. You could do this by yourself of course. It's just implemented for convenience. Here is an example before we continue:

person_data
var_specs7 = list(person = 'ID', response = 'response', item ='item',
                 person_covariables_main_effect = c('gender', 'IQ'))
response_data_long7 <- birtms::compose_dataset(response_data = response_data,
                                              response_columns = i01:i03,
                                              variable_specifications = var_specs7,
                                              person_data = person_data)
response_data_long7

Notice that you have to use IQ somewhere in your model. Because we already know the person_covariables_main_effect term, we include IQ as a second person covariable by assigning a character vector instead of a string (a character vector of length one). Notice that you mustn't assign vectors of length greater than one to the other terms we have come across so far. So this is a new feature, but we will see it again and again later on. Notice also that we find some NA entries in the final dataset. This is because there were no IQ values present for ma02.

Attention: This will cause all observations from ma02 to be dropped if we specify a model that uses IQ as a predictor and don't handle these missing values with brms::mi(). This is not possible inside birtms right now. You might be able to customise the model formula by hand, fit the model and convert it into a birtmsfit object to continue the evaluation with birtms, but unfortunately this is not supported in the current version of the package. So maybe you can work around this with imputation as explained by Paul Bürkner - author of the brms package - in this vignette.
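One simple workaround - assuming you can afford to lose the affected observations - is to drop incomplete rows yourself before fitting, e.g. with tidyr::drop_na (toy data below, mirroring the missing IQ value for ma02):

```r
library(tibble)
library(tidyr)

# toy long-format data where one person has no IQ value
long_data <- tribble(
  ~ID,    ~item, ~response, ~IQ,
  'ma01', 'i01', 1,         100,
  'ma02', 'i01', 0,         NA
)

# drop observations with a missing covariable before fitting
long_data_complete <- drop_na(long_data, IQ)
nrow(long_data_complete)  # 1
```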

item covariables {#itemcovariables}

For item-specific covariables you have to add a dataset with a column holding item identifiers that correspond to the former item response column names, and further columns holding the covariables. The name of the item identifier column must correspond to the name you chose in variable_specifications (or item if you use the default value). In this form you simply pass it to the item_data argument. Again: be aware of missing values!

item_data
var_specs8 = list(person = 'ID', response = 'response', item ='item',
                 item_covariables_intercept = c('pictures', 'color'))
response_data_long8 <- birtms::compose_dataset(response_data = response_data,
                                              response_columns = i01:i03,
                                              variable_specifications = var_specs8,
                                              item_data = item_data)
response_data_long8

For this example we can imagine that the items contained a different number of pictures and the background may have varied as well. Research questions could be related to multimedia theory or the psychological impact of colors on performance. Of course, the way it is represented here there would be way too little variation in the item properties to call it an experiment. Imagine there is much more data, especially more persons working on the items.

situation covariables

The last kind of covariables you can specify are situation or person*item covariables. The dataset to pass here is already in long format. Imagine we vary the background color at random in an online test.

situation_data
var_specs9 = list(person = 'ID', response = 'response', item ='item',
                 situation_covariables = 'color')
response_data_long9 <- birtms::compose_dataset(response_data = response_data,
                                              response_columns = i01:i03,
                                              variable_specifications = var_specs9,
                                              situation_data = situation_data)
response_data_long9

Closely compare the columns color in response_data_long8 and response_data_long9 to see the difference:

response_data_long9 %>% dplyr::rename(color.situation = color) %>% 
  dplyr::mutate(color.item = response_data_long8$color) %>% 
  DT::datatable()

Other examples for situation covariables might be: the item ordering in a randomized test or the fact that a special forces team accidentally entered the classroom (probably influencing the task performance while they are present and even when they are long gone).

This is all about shifting and composing your data into the format required by birtms with birtms::compose_dataset. You have learned how all the optional arguments work and how to use them to combine all your data in one dataset. Next we will have a look at the first heart of this package: birtms::build_formula.

Model specification {#modelspecification}

This package uses an approach to specify IRT models that avoids the specification of design matrices (see TAM [@TAM_3.5-19] for a frequentist R package), generalized (non-)linear formulae (see brms [@R-brms], rethinking [@rethinking]) or building a model up from scratch (see rstan [@rstan]) in favor of a list of key words. I hope this helps people specify their models more easily even if they don't get all the math behind generalized (non-)linear models (henceforth GNLMs) or lack programming skills. Compared to the edstan approach of using basic but ready-made rstan models I hope this approach is more flexible and intuitive.

On the other hand I am aware that it might get a bit messy to specify a really complex model in the key word way, because you have to get to know the right key words instead of understanding some basic principles. Therefore I hope to make this vignette as easy to understand and comprehensive as possible to guide you to the model you have in mind. I also recommend making a step further and having a look at the GNLMs generated by birtms::build_formula to get in touch with this approach, accompanied by the great vignettes of Paul Bürkner [e.g. regarding IRT modeling, @burkner2020bayesian] and @McElreath.2020 (for an in-depth introduction to Bayesian modeling using generalized linear models). The latter was ported from the accompanying package rethinking to brms by [@kurzStatisticalRethinkingSecondEd2020]. The next step would be to specify models directly in rstan. The function brms::make_stancode can generate the Stan model code from a formula obtained by birtms::build_formula or written by yourself via brms::bf (the shorthand version of brms::brmsformula).
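For example, a sketch of inspecting the generated Stan program (it assumes the default 1PL formula from birtms::build_formula works with an explicitly passed Bernoulli family; the toy data values are made up):

```r
library(birtms)
library(brms)

# toy long-format data, just to let brms generate the Stan code
data_long <- data.frame(
  person   = rep(c('p1', 'p2'), each = 2),
  item     = rep(c('i01', 'i02'), times = 2),
  response = c(1, 0, 1, 1)
)

f <- birtms::build_formula()  # default 1PL formula

# print the underlying Stan program without fitting anything
cat(brms::make_stancode(f, data = data_long,
                        family = brms::brmsfamily('bernoulli', link = 'logit')))
```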

1PL model

To get the formula for the simplest IRT model (a one-dimensional one-parametric model without any covariables or hierarchies for dichotomous data) you simply call:

formula_1PL_1dim <- birtms::build_formula()
formula_1PL_1dim

Notice that this is a linear formula where 1 represents the intercept, (1 | person) will give you pooled skill estimators for every person and (1 | item) will give you pooled easiness estimators for every item (since it's linked via + and not - as you often find it when approaching IRT directly)^[So to get difficulty estimators you have to multiply the easiness estimators by -1.]. Here I follow the recommendation found in Paul Bürkner's above-mentioned vignette [-@burkner2020bayesian] to pool the item estimators as well as the person estimators, since this stabilizes the modeling later on. This also punishes very extraordinary items, pulling their estimators a bit towards the mean. Describing this in depth is out of the scope of this vignette.

Of course the call above was the lazy version. To adjust the term names in the formula use the variable_specifications argument:

var_specs5 = list(person = 'ID', response = 'answer', item = 'task')
birtms::build_formula(variable_specifications = var_specs5)

This formula corresponds to the expression

$$\begin{equation} \tag{1} P(y_{ik} = 1 | \theta, \beta) = \frac{\exp(\theta_i + \beta_k)}{1 + \exp(\theta_i + \beta_k)} \end{equation}$$

where $\theta_i$ is the skill estimator of person $i$, $\beta_k$ is the easiness estimator of item $k$ and $y_{ik}$ is the answer given by person $i$ on item $k$. This is equivalent to

$$\begin{equation} \tag{2} \mathrm{logit}(P(y_{ik} = 1 | \theta, \beta)) = \theta_i + \beta_k \end{equation}$$

So $\theta$ is related to person or ID and $\beta$ is related to item or task in our formulae above. I think it is important to note that these $\beta$ here neither represent regression coefficients (which will be modeled for covariables later on) nor slope parameters.^[In [@McElreath.2020] the $\beta$s are used for regression coefficients, and in some IRT packages $\beta$ (or just $b$) is used for the slope parameters instead of $\alpha$ (or $a$) as I will use them later.] Here $\theta$ and $\beta$ (notice that they have no subscript) represent vectors containing all individual $\theta_i$ / $\beta_k$. I wrote $P(y_{ik} = 1 | \theta, \beta)$ since both - the $\theta$ and the $\beta$ - are modeled here and I don't treat them as something fixed / god-given; thus my notation differs from the notation in the article of @Jeon.2016.
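To see that formulas (1) and (2) describe the same model, here is a quick numerical check in base R with made-up values for $\theta_i$ and $\beta_k$:

```r
# made-up values for a single person-item pair
theta <- 0.5   # person skill
beta  <- -1.2  # item easiness

p1 <- exp(theta + beta) / (1 + exp(theta + beta))  # formula (1)
p2 <- plogis(theta + beta)                         # inverse logit, formula (2)
all.equal(p1, p2)                                  # TRUE
```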

2PL model

If you wish to specify a model with more than one parameter per item, the formula becomes nonlinear. This will slow down the process a bit because Stan has to do more operations in each iteration. But the main reason it takes longer to fit a model with more item parameters is that more proposed parameter sets will be rejected, since there are more possibilities for some parameters to be far off a reasonable value, which results in a worse likelihood. Nevertheless generating the corresponding formula is pretty easy and will bring up the second (and last) argument of birtms::build_formula:

var_specs10 = list(person = 'ID', response = 'response', item ='item') # these are the preset values and therefore could be left out
model_specs = list(item_parameter_number = 2)
formula_2PL_1dim <- birtms::build_formula(variable_specifications = var_specs10,
                      model_specifications = model_specs)
formula_2PL_1dim

This corresponds to the following formula

$$\begin{equation} \tag{3} \mathrm{logit}(P(y_{ik} = 1 | \theta, \beta, \alpha)) = \alpha_k \theta_i + \beta_k \end{equation}$$

Here $\alpha_k$ represents the slope or discrimination parameter estimators. Notice that the model draws values for logalpha ~ 1 + (1 | item) and not alpha directly to help with model identification by restricting all the alpha = exp(logalpha) to be positive. Therefore you won't see any negative slopes in your model. Instead you might find more alphas close to zero, indicating that the data suggests these items don't fit into the model you want to force them into. Check whether you have to reverse the scoring, or think about dropping these items from the further evaluation and editing the items before a future usage (if you can find a hint what the problem might be, e.g. a misleading item stem or distractors).^[It is possible to draw alpha directly if you find a proper (narrow) prior. You will learn a bit about priors in the chapter on model fitting, but since these are so important in Bayesian statistics I recommend having an in-depth look into some articles or books about Bayesian statistics before fiddling around blindly. And let me ask you this question: aren't you going to inspect carefully any item with a low slope parameter anyway, as you would do for negative parameters? So what would be your big gain?]
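A quick sketch (with made-up draws) of why the exp() link rules out negative slopes:

```r
# made-up posterior draws on the log scale
logalpha <- c(-2, 0, 1.5)

# the exp() transformation maps any real value to a positive slope
alpha <- exp(logalpha)
alpha  # approx. 0.135, 1, 4.482 - all positive
```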

3PL model

We can use the same tool to fit a three-parametric model (notice that we left out the variable_specifications argument since we are using the preset values in this example):

model_specs = list(item_parameter_number = 3)
formula_3PL_1dim <- birtms::build_formula(model_specifications = model_specs)
formula_3PL_1dim
formula_3PL_1dim$family

This corresponds to the following formula

$$\begin{equation} \tag{4} P(y_{ik} = 1 | \theta, \beta, \alpha, \gamma) = \gamma + (1 - \gamma) \frac{\exp(\theta_i + \beta_k)}{1 + \exp(\theta_i + \beta_k)} \end{equation}$$

Notice that $\gamma$ has no subscript. This means that a single pseudo guessing parameter is estimated unless something different is specified, as you can see below. I believe this is a good default behavior because this way all items can be compared regarding their easiness and slope parameters [@Han.2012] and a common estimator probably fits better than a common value of zero (even if this common estimator is not the 'real' value). Also this way there is more data the estimation is based on. Be aware that in general the estimation of item-specific pseudo guessing parameters needs a lot of data to become precise, since you need many responses from people with very low proficiency.

Notice the inverse logit transformation of the logitgamma parameter. Here we follow the approach of @burkner2020bayesian to ensure that the 'gamma' parameter will be between 0 and 1. Additionally the link function chosen in the family argument of the formula changes from logit to identity because we already use the inv_logit transformation in the main formula part.
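A minimal illustration (with made-up values) of how the inverse logit keeps gamma inside the unit interval:

```r
# made-up values on the logit scale
logitgamma <- c(-4, -1.1, 2)

# base R's plogis() is the inverse logit brms applies in the formula
gamma <- plogis(logitgamma)
round(gamma, 3)  # 0.018 0.250 0.881 - always between 0 and 1
```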

To force the model to estimate a pseudo guessing parameter for each item (or some item groups, e.g. for testlets) you have to state this as follows:

model_specs = list(item_parameter_number = 3)

# pseudo guessing parameter per item
var_specs11 = list(item = 'item', pseudo_guess_dimension = 'item') # item argument could be left out
formula_3PL_1dim <- birtms::build_formula(variable_specifications = var_specs11, 
                                          model_specifications = model_specs)
formula_3PL_1dim

# pseudo guessing parameter per item group (here testlet)
var_specs12 = list(item = 'item', pseudo_guess_dimension = 'testlet') # item argument could be left out
formula_3PL_1dim <- birtms::build_formula(variable_specifications = var_specs12, 
                                          model_specifications = model_specs)
formula_3PL_1dim

This corresponds to the following formulae:

$$\begin{align} \tag{5.a} P(y_{ik} = 1 | \theta, \beta, \alpha, \gamma) &= \gamma_k + (1 - \gamma_k) \frac{\exp(\theta_i + \beta_k)}{1 + \exp(\theta_i + \beta_k)} \\ \tag{5.b} P(y_{ik} = 1 | \theta, \beta, \alpha, \gamma) &= \gamma_{testlet} + (1 - \gamma_{testlet}) \frac{\exp(\theta_i + \beta_k)}{1 + \exp(\theta_i + \beta_k)} \end{align}$$

For select-1-from-k multiple choice test formats an alternative and common approach is to pass fixed guessing parameters of $\frac{1}{k}$. You can do this by using the argument fixed_pseudo_guess instead of pseudo_guess_dimension. If you have a single value for all items you can pass it as a single numeric. If you want to use different values for different items (e.g. if you have items with differing numbers of options to choose from) you can pass the name of a column which holds these values (as probabilities from 0.0 to 1.0). In the latter case you have to pass these values via the item_data argument (see item covariables).

model_specs = list(item_parameter_number = 3)

# common fixed pseudo guessing parameter for all items
var_specs13 = list(fixed_pseudo_guess = .25)
formula_3PL_1dim <- birtms::build_formula(variable_specifications = var_specs13, 
                                          model_specifications = model_specs)
formula_3PL_1dim

# fixed pseudo guessing parameter per item
var_specs14 = list(fixed_pseudo_guess = 'guess')
formula_3PL_1dim <- birtms::build_formula(variable_specifications = var_specs14, 
                                          model_specifications = model_specs)
formula_3PL_1dim
attributes(formula_3PL_1dim$pforms$gamma)$nl

Notice that in the second model the nl attribute for the gamma formula was set to TRUE, so these values are taken as they are and no regression weight is estimated here. (In the first model this formula does not exist because the value is plugged into the main formula directly.)

Be aware that using fixed_pseudo_guess and pseudo_guess_dimension at once does not work. In this case the pseudo_guess_dimension will be ignored:

model_specs = list(item_parameter_number = 3)

# common fixed pseudo guessing parameter for all items
var_specs15 = list(fixed_pseudo_guess = .25, pseudo_guess_dimension = 'item')
formula_3PL_1dim <- birtms::build_formula(variable_specifications = var_specs15, 
                                          model_specifications = model_specs)
formula_3PL_1dim

# fixed pseudo guessing parameter per item
var_specs16 = list(fixed_pseudo_guess = 'guess', pseudo_guess_dimension = 'item')
formula_3PL_1dim <- birtms::build_formula(variable_specifications = var_specs16, 
                                          model_specifications = model_specs)
formula_3PL_1dim

4PL model

We can even fit a four parametric model:

model_specs = list(item_parameter_number = 4)
formula_4PL_1dim <- birtms::build_formula(model_specifications = model_specs)
formula_4PL_1dim

This corresponds to the following formula:

$$\begin{equation} \tag{6} P(y_{ik} = 1 | \theta, \beta, \alpha, \gamma, \psi) = \gamma + (1 - \psi - \gamma) \frac{\exp(\theta_i + \beta_k)}{1 + \exp(\theta_i + \beta_k)} \end{equation}$$

Notice that in many texts you might find the fourth parameter given as the upper asymptote parameter (let's call it $\Psi$) with $\Psi = 1 - \psi$. Honestly I don't know if this makes any difference in the modeling approach beyond the need to adjust the prior's mean. I just followed the model specification shown in @Burkner.2020. Again this parameter has to be between 0 and 1, so we use an inv_logit transformation here.
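Formula (6) can be sketched as a small R function to see how the two asymptotes act (the parameter values below are made up; the slope $\alpha_k$ is omitted to match the notation of formula (6)):

```r
# item characteristic curve of formula (6)
p_4pl <- function(theta, beta, gamma = 0, psi = 0) {
  gamma + (1 - psi - gamma) * plogis(theta + beta)
}

# lower asymptote approaches gamma, upper asymptote approaches 1 - psi
p_4pl(theta = -10, beta = 0, gamma = 0.2, psi = 0.05)  # close to 0.20
p_4pl(theta =  10, beta = 0, gamma = 0.2, psi = 0.05)  # close to 0.95
```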

The arguments used to specify the 4PL parameters work analogously to those of the 3PL models. Just replace pseudo_guess by careless_error. Be aware that a 4PL model can have severe identification issues which have to be overcome with narrow priors or a huge amount of data per item. Following are some examples of how to use the 4PL arguments along with the 3PL arguments:

model_specs = list(item_parameter_number = 4)

# careless error parameter per item
var_specs16 = list(careless_error_dimension = 'item')
formula_4PL_1dim <- birtms::build_formula(variable_specifications = var_specs16,
                                          model_specifications = model_specs)
formula_4PL_1dim

# common fixed careless error parameter for all items
var_specs17 = list(fixed_careless_error = .05)
formula_4PL_1dim <- birtms::build_formula(variable_specifications = var_specs17,
                                          model_specifications = model_specs)
formula_4PL_1dim

# fixed careless error parameters per item
var_specs18 = list(fixed_careless_error = 'error')
formula_4PL_1dim <- birtms::build_formula(variable_specifications = var_specs18,
                                          model_specifications = model_specs)
formula_4PL_1dim

# common fixed pseudo guess and careless error parameter for all items
var_specs19 = list(fixed_pseudo_guess = .25, fixed_careless_error = .05)
formula_4PL_1dim <- birtms::build_formula(variable_specifications = var_specs19,
                                          model_specifications = model_specs)
formula_4PL_1dim

Multidimensional models

With birtms you can specify multidimensional IRT models. There are two approaches to specify a multidimensional model in birtms, which you can combine flexibly as you please. The two approaches are called regular dimensions and unregular dimensions in this package and are described below. Within a multidimensional model you specify multiple (related) latent traits that are needed to solve the presented items. Therefore there will be multiple estimates for the persons' abilities $\theta_{di}$ and possibly also multiple item slopes $\alpha_{dk}$, where $d$ indexes the dimensions. The formula for the two-parametric model therefore becomes:

$$\begin{equation} \tag{7} \mathrm{logit}(P(y_{ik} = 1 | \theta, \beta, \alpha)) = \sum_{d}{\alpha_{dk} \theta_{di}} + \beta_k \end{equation}$$

To choose whether you should specify the item-ability structure with a regular or unregular dimension approach you have to look at what might be called a loading matrix (which corresponds to the Q matrix in TAM). If you assume that there are two (or more) abilities related to the items such that every item relates to exactly one of these abilities (e.g. in formula 8.a), you can use the regular_dimensions argument to model this. You can use the unregular_dimensions argument as well if you prefer that approach.

$$\begin{align} \tag{8.a} Q_1 & = \left(\begin{array}{cc} \alpha_{11} & 0 \\ \alpha_{12} & 0 \\ 0 & \alpha_{23} \\ 0 & \alpha_{24} \end{array} \right) \end{align}$$

If you assume that some items are related to a set of abilities in a way that the intersection of the loading vectors is not empty (in formula 8.b you find item 3 related to dimensions 1 and 2), you have to use the unregular_dimensions argument. You can't use the regular_dimensions argument to model the whole structure in this case.

$$\begin{align} \tag{8.b} Q_2 & = \left(\begin{array}{cc} \alpha_{11} & 0 \\ \alpha_{12} & 0 \\ \alpha_{13} & \alpha_{23} \\ 0 & \alpha_{24} \end{array} \right) \end{align}$$

However, if you find a subset of dimensions suiting the approach in formula 8.a you can use the regular_dimensions argument for these and use the unregular_dimensions argument to add the other item-ability terms. In formula 8.c you could model the first two loading vectors with the regular approach and add the last vector using the unregular approach.

$$\begin{align} \tag{8.c} Q_3 & = \left(\begin{array}{ccc} \alpha_{11} & 0 & \alpha_{31} \\ \alpha_{12} & 0 & 0 \\ 0 & \alpha_{23} & 0 \\ 0 & \alpha_{24} & \alpha_{34} \end{array} \right) \end{align}$$

From a modeling efficiency viewpoint there are no big differences between these two approaches^[However, a one-parametric model will be faster if there are no unregular terms because it can be specified as a linear model and hence fewer operations have to be performed by the algorithm. If you specify a higher-parametric model by yourself you might find the regular approach to be faster as well. If you do, check whether you model a single distribution for the item slopes or a hierarchical distribution for every dimension. If you use a single distribution the model will be faster but it can't reflect possible differences in the item slopes (variances) as well. Therefore birtms accounts for different item slope distributions per dimension by default, so it aligns with the unregular approach. Have a look at my post at the Stan forum. You can restrict the model (prevent this hierarchical modeling) by setting the model_unique_alpha_groups_on_regular_dimensions value in the model_specifications argument to FALSE.]. But the regular approach can be expressed in a single column in the item_data dataframe and I find the estimated parameters (as presented via brms::summary) can be interpreted more intuitively. On the other hand the unregular approach can be used to model any item-ability structure, and if you are used to the loading matrices from TAM you might prefer this way. Be aware that this way there will be estimators for all item slopes, even those that have a 0 in the loading matrix. You can identify them by being equivalent to the prior distribution you specified for them; hence they should have a bigger sd and a mean that is equal to your prior's mean (e.g. zero when you assume item slopes around one; remember the exp() transformation).
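A small base-R sketch (with the zero pattern of $Q_2$ from formula 8.b, using 1 as a placeholder for the nonzero loadings) of how you might check whether a loading matrix suits the regular approach, i.e. whether every item loads on exactly one dimension:

```r
# Loading pattern of Q2 from formula (8.b): rows = items, columns = dimensions
Q2 <- matrix(c(1, 0,
               1, 0,
               1, 1,   # item 3 loads on both dimensions
               0, 1),
             nrow = 4, byrow = TRUE,
             dimnames = list(paste0("i0", 1:4), c("dim1", "dim2")))

# The regular approach requires every item to load on exactly one dimension
is_regular <- all(rowSums(Q2 != 0) == 1)
is_regular  # FALSE here: item 3 forces the unregular approach
```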

Regular Dimensions {#regulardimensions}

item_data_dimensions <- tibble::tribble(
  ~item, ~inquiry, ~math, ~mentalrotation, ~hypothesis, ~experiment,
  'i01', 'Hypothesis', 1, 0, 1, 0,
  'i02', 'Hypothesis', 1, 1, 1, 0,
  'i03', 'Experiment', 0, 1, 0, 1,
  'i04', 'Experiment', 1, 0, 0, 1
)

Let's assume you have created a science test and your items differ in the abilities needed to solve them. For example some of these items could ask for the formulation of a hypothesis and some require coming up with an experimental design (but never both at the same time and always at least one of these things). You find an example of how to set up the corresponding item_data below:

item_data_dimensions %>% dplyr::select(item, inquiry)

The dimension column can be a character or factor (possibly also an integer) variable.

You get the formula for a model that estimates an ability for each of those two tasks with the following call:

var_specs20 = list(regular_dimensions = 'inquiry')
model_specs1pl = list(item_parameter_number = 1)
build_formula(variable_specifications = var_specs20, model_specifications = model_specs1pl)

The corresponding formula for a two-parametric model would be:

model_specs2pl = list(item_parameter_number = 2)
build_formula(variable_specifications = var_specs20, model_specifications = model_specs2pl)

Notice the logalpha1 ~ 1 + (1 | inquiry/item) term that models a hierarchical distribution of the item slopes within each dimension.

Unregular Dimensions {#unregulardimensions}

You could fit the same structure as shown in the part about regular dimensions with unregular dimensions as well. But be aware that you have to use a different representation of your structure in item_data. Instead of using a single column that names the corresponding dimension you have to create one column per dimension and fill it with 1 if the item is related to that dimension and 0 otherwise:
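If your item_data currently holds a single dimension column (as in the regular example above), one way to derive such indicator columns is sketched below in base R; birtms does not require this helper, it is just one possible construction:

```r
# Rebuild the example item data in base R (same content as item_data_dimensions)
item_data <- data.frame(
  item    = c("i01", "i02", "i03", "i04"),
  inquiry = c("Hypothesis", "Hypothesis", "Experiment", "Experiment")
)

# Turn the single dimension column into one 0/1 indicator column per dimension
item_data$hypothesis <- as.integer(item_data$inquiry == "Hypothesis")
item_data$experiment <- as.integer(item_data$inquiry == "Experiment")
item_data
```

For a genuinely unregular structure you would simply set several of these indicators to 1 for the same item.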

item_data_dimensions %>% dplyr::select(item, inquiry, hypothesis, experiment)

With these new columns you can get a one-parametric formula via:

var_specs21 = list(unregular_dimensions = c('hypothesis', 'experiment'))
build_formula(variable_specifications = var_specs21, model_specifications = model_specs1pl)

Notice that this formula isn't linear anymore since the columns hypothesis and experiment are used as mapping vectors. Their values (0 or 1) are multiplied as they are (without getting an additional regression coefficient) by the ability estimates (theta1 and theta2) because this multiplication is stated in a non-linear formula^[In a linear formula an interaction effect and hence regression coefficients would be estimated.]. Notice that these mapping vectors stand in the same spot as the item slopes in the two-parametric formulae we have seen so far. But they are no latent variables that get estimated by the model but predictors with fixed (by yourself) values. To stress this difference let's have a look at a formula for a two-parametric model:

build_formula(variable_specifications = var_specs21, model_specifications = model_specs2pl)

The mapping vectors are indeed multiplied in addition to the slope estimates, not instead of them.
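To illustrate this with plain numbers, here is a minimal base-R sketch of such a non-linear predictor term: the fixed mapping values multiply the exp-transformed slope estimates, mirroring the model's parametrization (the parameter names follow the formula output above; all numeric values are invented):

```r
# Hypothetical estimates for one item related to both dimensions (like item 3 in Q2)
hypothesis <- 1; experiment <- 1      # fixed mapping values from item_data
logalpha1 <- 0.2; logalpha2 <- -0.1   # estimated log item slopes
theta1 <- 0.4; theta2 <- -0.6         # estimated person abilities
skillintercept <- 0.3                 # estimated easiness term

# The mapping values gate the slope-ability products; they do not replace the slopes
eta <- skillintercept +
  hypothesis * exp(logalpha1) * theta1 +
  experiment * exp(logalpha2) * theta2
plogis(eta)  # response probability
```

Setting a mapping value to 0 removes that dimension's contribution entirely, even though its slope parameter is still estimated.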

Be aware that the elpd can be worse with this approach because more parameters are estimated than are actually used later on.

Common Dimension and Bifactor models

Model fitting {#modelfitting}

Now that you have learned how to set up a dataset for usage with birtms and how to get the formula that describes the model you have in mind, we come to the step you probably hungered for all the time: fitting the model. Here the birtms::birtm function comes into play^[The messengers behind the scenes are brms::brm and rstan::stan, who transform the formula and data into proper Stan code (which will be C++ code for best performance).]. It takes your data and formula and passes all along the long road to the modeling algorithms implemented in Stan. It also asks for the variable_specifications and model_specifications you used to generate the formula so it can save these parameters for usage by all the evaluation functions coming with this package^[It actually was one of my main reasons to create this framework so my evaluation functions can handle all the parameters the right way and show us the outputs in the right format.]. birtms::birtm has the following arguments we will discuss below.

birtm(data, formula,
      variable_specifications = NULL, model_specifications = NULL,
      prior = NULL, cores = 4, file = NULL, refit = FALSE,
      check_prior_influence = FALSE, preparing_bf = FALSE,
      add_waic = TRUE, add_loo = TRUE, ...)

The data argument expects you to pass a dataset in long format (i.e. composed via birtms::compose_dataset). Analogously the formula argument expects a formula of your model (i.e. built via birtms::build_formula). You find the details about those two functions above. The arguments cores, file and prior are passed to brms::brm. The file argument states where the model should be saved on your hard disk (so you don't have to fit it again next session). If you want to replace this file with a new model fit you can delete the old file manually or set the refit argument to TRUE. The cores argument sets on how many cores the MCMC chains run in parallel. The default is 4 chains, so more than 4 cores don't make sense in this case. Decreasing the core number will increase the run time but decrease the performance drain on your system. But most important is the argument prior since the specification of prior beliefs strikes the heart of Bayesian modeling. Unfortunately it's out of this article's scope to cover this topic^[Have a look at the brms documentation on how to specify priors. Have a look at @burkner2020bayesian for some first examples of lightly informative priors to use for the partly pooled item and person estimators. Have a deeper look at @McElreath.2020 or @Kruschke.2014 to get a feeling for what priors could be placed on linear predictors and how to state them appropriately to your prior beliefs' strength.]. But let me give you this rule of thumb: never fit your final models without specifying priors. Additionally, arguments can be passed to brms::brm via the ... argument. For details see the brms documentation [@R-brms].

The check_prior_influence argument has to be boolean and specifies whether the specified priors are sampled as well^[It sets the sample_prior argument in brms::brm.]. This can be used to check how strong the influence of your priors is compared to the influence of the provided data via a graphical comparison afterwards. Setting this argument to TRUE will increase the size of the fitted model object slightly (generally under a few percent). Analogously the preparing_bf argument is a boolean that is related to the save_pars argument of brms::brm. If you want to perform tests based on the Bayes factor afterwards you have to set this to TRUE. Be aware that this will double the size of the model fit! Therefore the arguments sample_prior and save_pars should not be passed to the ... argument.

The last arguments to cover are add_waic and add_loo, which are boolean and control whether these criteria for model comparison and identification of overfitting issues are calculated. Their default value is TRUE since they are computed relatively fast (compared to the modeling time), their results are really informative and they don't need much disk space. Be aware: Since IRT models are always hierarchical regarding the observations (there are multiple answers related to each person since they answer multiple items) it might be more appropriate to use a grouped k-fold cross-validation criterion instead of PSIS-LOO^[The computation of a 10-fold grouped (by person) LOO will take about ten times as long as modeling the full dataset. If you want to leave out just one person's responses at a time you quickly get to a computation time multiplied by 100 or 1000 (depending on your sample size). You can do this with brms::kfold using the group and folds arguments [@R-brms]. The only study exploring how big the differences between PSIS-LOO (or exact LOO) and grouped k-fold LOOs might be is this case study by Vehtari. It finds that the elpd values may differ quite a lot between the methods but that the order of the compared models seems to be the same. So to just compare which model fits better PSIS-LOO seems to give us the same answers. But keep in mind that this study is not IRT specific. In my work I found that high Pareto k values (which indicate some overfitting issues) mostly aggregate around specific items, not persons. Notice further that the computation of exact LOO will probably take forever since we often have so many observations (persons times items) and might want to model complex models that take more than an hour for a single run. So exact LOO seems to be undesirable for two reasons: its computation takes very, very long and it does not reflect the hierarchical structure of our datasets.].
But most of the time these criteria will give us reliable information about which of our models^[There might be better models we have not modeled yet and there probably is no "right" model.] fits the data best while signaling possible overfitting issues to us.
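The grouping idea behind a person-grouped k-fold can be sketched in base R (the actual cross-validation is done by brms::kfold; the fold assignment below only illustrates that all responses of a person must land in the same fold):

```r
set.seed(1)
# Toy long-format data: 20 persons answering 5 items each
long_data <- expand.grid(person = 1:20, item = paste0("i", 1:5))

# Assign each person (not each response) to one of 10 folds
folds_per_person <- sample(rep(1:10, length.out = 20))
long_data$fold <- folds_per_person[long_data$person]

# Check: every person's responses share a single fold
stopifnot(all(tapply(long_data$fold, long_data$person,
                     function(x) length(unique(x))) == 1))
```

A fold vector built this way could then be handed to the folds argument of brms::kfold together with group = "person".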

Done — we covered all the arguments of birtms::birtm, so finally we can see it in action. We will use the SPM-LS dataset from @Myszkowski.2018 as did @Burkner.2020, but for faster model fitting we will only take the responses of the first 100 persons. Below you see the whole process of fitting a model with birtms. First we shift the response data from wide to long format. Second we get a formula for a one-dimensional one-parametric IRT model (the easiest one). Finally we fit the model and inspect its summary.

testdata <- birtms::compose_dataset(response_data = data_spm[1:100,], response_columns = i1:i12)
formula_1d_1pl <- birtms::build_formula()
fit_1d_1pl_spm1 <- birtms::birtm(data = testdata, formula = formula_1d_1pl, 
                                 file = '../inst/extdata/fit_1d_1pl_spm1')
fit_1d_1pl_spm1

Notice that we took advantage of the lazy specification form since the person identifier column already was named person and we did not alter any of the presettings. Notice further that the summary above differs somewhat from the summary of fit_1d_1pl_spm2 (see below) even though both models use the same data and the same formula. The difference here results from the fact that the MCMC chains in both calculations don't take the same path. The paths are pretty similar and converge to very similar values. Since the difference is much smaller than the estimation error we can't really say that the parameter distributions are different. Increasing the iteration number would probably decrease any differences.

Next you will see another example to show a more complex model:

var_specs <- list(person_covariables_main_effect = 'gender')
model_specs <- list(item_parameter_number = 2)
person_data <- tibble::tibble(person = 1:100, gender = rbinom(n = 100, size = 1, prob = .5) %>% factor(labels = c('male', 'female')))
testdata <- birtms::compose_dataset(response_data = data_spm[1:100,], response_columns = i1:i12, person_data = person_data, variable_specifications = var_specs)
formula_1d_2pl <- birtms::build_formula(variable_specifications = var_specs, model_specifications = model_specs)
prior_2pl_minimal <- brms::prior("normal(0, 2)", class = "b", nlpar = "skillintercept") +
   brms::prior("constant(0)", class = "b", nlpar = "personcovars", coef = "gendermale") +
  brms::prior("normal(0, 1)", class = "b", nlpar = "logalpha") +
  brms::prior("constant(1)", class = "sd", group = "person", nlpar = "theta")
fit_1d_2pl_spm <- birtms::birtm(data = testdata, formula = formula_1d_2pl, prior = prior_2pl_minimal,
                                variable_specifications = var_specs, model_specifications = model_specs,
                                 file = '../inst/extdata/fit_1d_2pl_spm')
summary(fit_1d_2pl_spm, priors = TRUE)

Notice: since we randomly added the gender variable we don't find any effect from gender on the persons' skill. This totally fits our expectations. Let's finish this section with some notes on the minimal priors: You can't drop one of these without getting corrupted estimators. If you drop the constant(1) on theta there will be a big estimation error on theta_Intercept and logalpha_Intercept. Dropping the normal(0, 1) on logalpha will result in a mess as well. But you could narrow or widen the prior by decreasing or increasing the SD so it reflects your prior belief's strength. Without priors on skillintercept and personcovars you can't start modeling at all. Since gender is a categorical predictor (and hence represented by a factor or character variable) you have to set one of its levels to constant(0) manually. You don't have to do this for continuous predictors (represented by numeric variables). But you'd still have to specify a prior for these (e.g. brms::prior("normal(0, 1)", class = "b", nlpar = "personcovars", coef = "gender")).
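A short base-R sketch of why pinning one level is necessary: a factor predictor is expanded into one indicator column per level, so without fixing one coefficient the columns sum to a constant and the corresponding effects are not identified (brms's internal design matrix for non-linear parameters may differ in details, but the idea is the same):

```r
# A factor predictor is expanded into indicator columns internally
gender <- factor(c("male", "female", "female", "male"))

# Without an intercept, one column per level is created; their row sums are
# always 1, so one coefficient must be pinned (e.g. via constant(0))
mm <- model.matrix(~ 0 + gender)
mm
```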

All in one command {#aio}

If you are used to the formulation of IRT models with birtms or when you're going to fit some simple models, the only function to use might be birtm_aio. It's a wrapper function for the former three functions. This way you can fit a model with just one command. Therefore it has the union of all the arguments the former functions had (see below). Again the ... arguments get passed to brms::brm and some presettings for the fitting process differ from the brms::brm settings (i.e. cores = 4). Also notice that adding the WAIC and LOO model fit criteria is enabled by default. Notice that your response_data has to be in wide format if you use birtm_aio.

birtm_aio(response_data, response_columns, 
          person_data = NULL, item_data = NULL, situation_data = NULL,
          variable_specifications = NULL, model_specifications = NULL,
          prior = NULL, cores = 4, file = NULL, refit = FALSE,
          check_prior_influence = FALSE, preparing_bf = FALSE,
          add_waic = FALSE, add_loo = FALSE, ...)

In an easy example you would use it like this:

testdata <- data_spm[1:100,] # take only 1/5th of the data to speed up the examplary model fitting
fit_1d_1pl_spm2 <- birtms::birtm_aio(response_data = testdata, response_columns = i1:i12, 
                                    file = '../inst/extdata/fit_1d_1pl_spm2')
fit_1d_1pl_spm2

Fitting the model the first time might take around one minute. After that time you see the summary of the model created by brms::summary. The file argument saves the model at the specified path. In this case you will find it in the inst/extdata folder (../ was needed to jump one level up from the vignettes folder) as fit_1d_1pl_spm2.rds. Notice that we didn't use any prior specifications here. Again: you should not do this for your own models!

Nevertheless, if you run into errors you can't solve right away it is recommended to take a step back and follow the three steps of model fitting described before one at a time to locate at which point the error occurs.

Further notes

At some point your models might get rather complex and you might get a warning that there have been divergent transitions after warmup or that the maximum treedepth was exceeded. In this case you can try to overcome these issues by passing arguments to the ... argument that will be passed down to the Stan engine. For example you can pass control = list(adapt_delta = 0.90), raising the target acceptance rate, which leads to smaller step sizes. This will make the modeling process slower and is not guaranteed to solve your problems, so you should not alter the value unless needed. For further reading see this.

References


