In duckmayr/gpirt: Tools for a Gaussian Process IRT Model

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Introduction

The gpirt package provides an MCMC sampler and related tools for a Gaussian Process Item Response Theoretic (GP-IRT) model. Unlike other IRT models that make strong assumptions about the functional form of the "item response function," or the function mapping respondents' latent trait to their response to items, we place a GP prior over the distribution of functions mapping latent traits to responses, and simultaneously learn these functions and the respondents' latent traits.

For readers unfamiliar with Gaussian processes, we refer them @RW:2006, available here. In what follows, we first detail our MCMC algorithm for sampling the model parameters' posterior distribution. We then walk through a simple but complete example, from data preparation, to prior specification and sampling, and analysis of results. For readers seeking a more detailed presentation of the GPIRT model, please refer to our paper introducing the model [@GPIRTpaper].

Algorithm

We include at the outset, a brief explanation of the notation used throughout:

$n$ is the number of respondents, indexed by $i$.
$m$ is the number of items, indexed by $j$.
$T$ is the number of iterations, indexed by $t$.
$y$ is an $n \times m$ dimension matrix of each respondent's response to each item.
$\theta$ is the $n \times d$ dimension matrix of respondents' latent traits, which we will commonly call ideology (as would be the case in the authors' substantive applications---political science). Currently the algorithm is only implemented for $d = 1$.
$f_j$ is the latent function mapping ideology to responses for item $j$, through a sigmoid function; i.e., $\Pr(y_{ij} = 1) = \sigma(f_j(\theta_i))$.
$f = \lbrace f_{j} \rbrace$. In our code, we refer to the $n \times m$ matrix storing the current draw of $\lbrace f_j(\theta) \rbrace$ as f.

Superscripts denote iterations of the algorithm and subscripts denote respondents/items. For example, $\theta_i^t$ is the value drawn for respondent $i$'s ideology parameter at iteration $t$ of our algorithm.

Priors

We place a standard normal prior over each $\theta_i$; $\theta \sim \mathcal{N}\left( 0, 1\right)$. the variance for all ideology parameter priors is 1, but the prior means may vary. We place a GP prior over the $f_j$. We support a mean function of zero, a linear mean, or a quadratic mean. We use the squared exponential covariance function (also called the radial basis function),

$$ K(\theta, \theta') = \exp\left(-0.5 \| \theta - \theta' \|^2 \right). $$

MCMC Algorithm

Our MCMC algorithm proceeds as follows:

Initialize $f$ and $\theta$: $(f^{0}, \theta^{0})$. a. We allow users to specify initial values for $\theta$; otherwise, $\theta^0$ is drawn from the prior distribution of $\theta$. b. Draw the $f_j$ from their prior distribution, i.e. a normal distribution with mean zero and covariance $K(\theta^0, \theta^0)$.
For $t = 1, ..., T$: a. Use an elliptical slice sampler to draw $f^t$ given $\theta^{t-1}$, $f^{t-1}$, and $y$. For more details on elliptical slice samplers, please consult @ESS, available here. b. Given $f^t$ and $\theta^{t-1}$, draw $f^ = { f_j(\theta^) }$ for $\theta^* = {-5, -4.99, \ldots, 4.99, 5}$ from

\begin{align} f_j^t(\theta^*) \mid \theta^\ast, \theta^{t-1}, f^t, \beta & \sim \mathcal{N}\left(\mathbf{m}^\ast, \mathbf{C}^\ast\right), \ \mathbf{m}^\ast & = \mu(\boldsymbol\theta^\ast) + K(\boldsymbol\theta^\ast, \boldsymbol\theta)\, K(\boldsymbol\theta, \boldsymbol\theta)^{-1} \bigl(\mathbf{f}_i - \mu(\boldsymbol\theta)\bigr), \ \mathbf{C}^\ast &= K(\boldsymbol\theta^\ast, \boldsymbol\theta^\ast) - K(\boldsymbol\theta^\ast, \boldsymbol\theta) K(\boldsymbol\theta, \boldsymbol\theta)^{-1} K(\boldsymbol\theta, \boldsymbol\theta^\ast) \end{align}

(see @RW:2006, equation 2.19). c. Use inverse transform sampling to draw $\theta^t$ by brute-force computation of the conditional posterior on the grid $\theta^*$.

Example

We will use an example from the U.S. Senate. Congressional voting data is available from Voteview [@voteview-data]. gpirt makes available a subset of that data, Senators' votes in the U.S. Senate's first session of the 116th Congress:

library(gpirt)
data("senate116")

The data from Voteview is in a tidy format,^[See tidyr's "Tidy Data" Vignette] but because we want a matrix of responses, we will actually have to make it messy. Here is a tidyverse solution for doing so:

library(dplyr)
library(tidyr)
responses <- senate116 %>%
    ## We only need the vote identifier, member identifier, and vote code:
    select(icpsr, rollnumber, cast_code) %>%
    ## Then we spread the data out:
    spread(rollnumber, cast_code)
## It may be useful to retain the Senators' ID numbers
rownames(responses) <- responses$icpsr
## Then we can remove that column
responses <- responses[ , -1]

With our properly shaped data, we create a response_matrix object, which will ensure the votes are coded in ${\text{NA}, -1, 1}$ and eliminate unanimous items:

responses   <- as.response_matrix(responses)
str(responses)

(Note the messages printed by as.response_matrix() let us know which unanimous items had to be discarded). Now we can run our sampler:

set.seed(1119) # For reproducibility
samples <- gpirtMCMC(responses, sample_iterations = 5000, burn_iterations = 0)
str(samples)

Then we can get estimates of our ideology parameters (by taking the mean of the columns of samples[["theta"]]) without ever having to assume the precise functional form between Senators' ideology and their roll call vote responses.