```r
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```
```r
library(IRTest)
library(ggplot2)
library(gridExtra)
```
IRTest is a useful tool for IRT (item response theory) parameter estimation, especially when a violation of the normality assumption on the latent distribution is suspected.
IRTest deals with a unidimensional latent variable.
For missing values, IRTest adopts the full information maximum likelihood (FIML) approach.
In IRTest, besides the conventional usage of a Gaussian distribution, several methods are available for the estimation of the latent distribution:
- the empirical histogram method (`EHM`),
- the two-component Gaussian mixture distribution (`2NM`),
- Davidian curves (`DC`),
- kernel density estimation (`KDE`),
- log-linear smoothing (`LLS`).
The CRAN version of IRTest can be installed from the R console with:
```r
install.packages("IRTest")
```
The development version can be installed from the R console with:
```r
devtools::install_github("SeewooLi/IRTest")
```
The following functions are available in IRTest:
- `IRTest_Dich` is the estimation function when all items are dichotomously scored.
- `IRTest_Poly` is the estimation function when all items are polytomously scored.
- `IRTest_Cont` is the estimation function when all items are continuously scored.
- `IRTest_Mix` is the estimation function for a mixed-format test, a test comprising both dichotomous item(s) and polytomous item(s).
- `factor_score` estimates factor scores of examinees.
- `coef_se` returns standard errors of item parameter estimates.
- `best_model` selects the best model using an evaluation criterion.
- `item_fit` tests the statistical fit of all items individually.
- `inform_f_item` calculates the information value(s) of an item.
- `inform_f_test` calculates the information value(s) of a test.
- `plot_item` draws the item response function(s) of an item.
- `reliability` calculates the marginal reliability coefficient of IRT.
- `latent_distribution` returns evaluated PDF value(s) of an estimated latent distribution.
- `DataGeneration` generates several objects that can be useful for computer simulation studies, including simulated item parameters, ability parameters, and the corresponding item-response data.
- `dist2` is a probability density function of a two-component Gaussian mixture distribution.
- `original_par_2GM` converts re-parameterized parameters of a two-component Gaussian mixture distribution into the original parameters.
- `cat_clps` recommends category collapsing based on item parameters (or, equivalently, item response functions).
- `recategorize` implements the category collapsing.
- `adaptive_test` estimates ability parameter(s), which can be utilized for computerized adaptive testing (CAT).

For the S3 methods, `anova`, `coef`, `logLik`, `plot`, `print`, and `summary` are available.
The function `DataGeneration` can be used in a preparation step. This function returns a set of artificial data and the true parameters underlying the data.
```r
Alldata <- DataGeneration(
  model_D = 2,
  N = 1000,
  nitem_D = 15,
  latent_dist = "2NM",
  d = 1.664,
  sd_ratio = 2,
  prob = 0.3
)
data <- Alldata$data_D
theta <- Alldata$theta
colnames(data) <- paste0("item", 1:15)
```
```r
Mod1 <- IRTest_Dich(
  data = data,
  model = 2,
  latent_dist = "LLS",
  h = 4
)
```
```r
### Summary
summary(Mod1)

### Log-likelihood
logLik(Mod1)

### The estimated item parameters
coef(Mod1)

### Standard errors of the item parameter estimates
coef_se(Mod1)

### The estimated ability parameters
fscore <- factor_score(Mod1, ability_method = "MLE")
plot(theta, fscore$theta)
abline(b = 1, a = 0)

### Standard errors of ability parameter estimates
plot(fscore$theta, fscore$theta_se)
```
```r
plot(Mod1, mapping = aes(colour = "Estimated"), linewidth = 1) +
  lims(y = c(0, .75)) +
  geom_line(
    mapping = aes(
      x = seq(-6, 6, length = 121),
      y = dist2(seq(-6, 6, length = 121), prob = .3, d = 1.664, sd_ratio = 2),
      colour = "True"
    ),
    linewidth = 1
  ) +
  labs(title = "The estimated latent density using '2NM'", colour = "Type") +
  theme_bw()
```
```r
p1 <- plot_item(Mod1, 1)
p2 <- plot_item(Mod1, 4)
p3 <- plot_item(Mod1, 8)
p4 <- plot_item(Mod1, 10)
grid.arrange(p1, p2, p3, p4, ncol = 2, nrow = 2)
```
```r
item_fit(Mod1)
```

```r
reliability(Mod1)
```
Each examinee's posterior distribution is identified in the E-step of the estimation algorithm (i.e., EM algorithm).
Posterior distributions can be found in `Mod1$Pk`.
```r
set.seed(1)
selected_examinees <- sample(1:1000, 6)
post_sample <- data.frame(
  X = rep(seq(-6, 6, length.out = 121), 6),
  prior = rep(Mod1$Ak / (Mod1$quad[2] - Mod1$quad[1]), 6),
  posterior = 10 * c(t(Mod1$Pk[selected_examinees, ])),
  ID = rep(paste("examinee", selected_examinees), each = 121)
)

ggplot(data = post_sample, mapping = aes(x = X)) +
  geom_line(mapping = aes(y = posterior, group = ID, color = 'Posterior')) +
  geom_line(mapping = aes(y = prior, group = ID, color = 'Prior')) +
  labs(title = "Posterior densities for selected examinees",
       x = expression(theta), y = 'density') +
  facet_wrap(~ID, ncol = 2) +
  theme_bw()
```
```r
ggplot() +
  stat_function(
    fun = inform_f_test,
    args = list(Mod1)
  ) +
  stat_function(
    fun = inform_f_item,
    args = list(Mod1, 1),
    mapping = aes(color = "Item 1")
  ) +
  stat_function(
    fun = inform_f_item,
    args = list(Mod1, 2),
    mapping = aes(color = "Item 2")
  ) +
  stat_function(
    fun = inform_f_item,
    args = list(Mod1, 3),
    mapping = aes(color = "Item 3")
  ) +
  stat_function(
    fun = inform_f_item,
    args = list(Mod1, 4),
    mapping = aes(color = "Item 4")
  ) +
  stat_function(
    fun = inform_f_item,
    args = list(Mod1, 5),
    mapping = aes(color = "Item 5")
  ) +
  lims(x = c(-6, 6)) +
  labs(title = "Test information function", x = expression(theta), y = 'information') +
  theme_bw()
```
```r
Alldata <- DataGeneration(
  model_P = "GRM",
  categ = rep(c(3, 7), each = 7),
  N = 1000,
  nitem_P = 14,
  latent_dist = "2NM",
  d = 1.664,
  sd_ratio = 2,
  prob = 0.3
)
data <- Alldata$data_P
theta <- Alldata$theta
colnames(data) <- paste0("item", 1:14)
```
```r
Mod1 <- IRTest_Poly(
  data = data,
  model = "GRM",
  latent_dist = "KDE"
)
```
```r
### Summary
summary(Mod1)

### Log-likelihood
logLik(Mod1)

### The estimated item parameters
coef(Mod1)

### Standard errors of the item parameter estimates
coef_se(Mod1)

### The estimated ability parameters
fscore <- factor_score(Mod1, ability_method = "MLE")
plot(theta, fscore$theta)
abline(b = 1, a = 0)

### Standard errors of ability parameter estimates
plot(fscore$theta, fscore$theta_se)
```
```r
plot(Mod1, mapping = aes(colour = "Estimated"), linewidth = 1) +
  stat_function(
    fun = dist2,
    args = list(prob = .3, d = 1.664, sd_ratio = 2),
    mapping = aes(colour = "True"),
    linewidth = 1
  ) +
  lims(y = c(0, .75)) +
  labs(title = "The estimated latent density using '2NM'", colour = "Type") +
  theme_bw()
```
```r
p1 <- plot_item(Mod1, 1)
p2 <- plot_item(Mod1, 4)
p3 <- plot_item(Mod1, 8)
p4 <- plot_item(Mod1, 10)
grid.arrange(p1, p2, p3, p4, ncol = 2, nrow = 2)
```
```r
item_fit(Mod1)
```

```r
reliability(Mod1)
```
Each examinee's posterior distribution is identified in the E-step of the estimation algorithm (i.e., EM algorithm).
Posterior distributions can be found in `Mod1$Pk`.
```r
set.seed(1)
selected_examinees <- sample(1:1000, 6)
post_sample <- data.frame(
  X = rep(seq(-6, 6, length.out = 121), 6),
  prior = rep(Mod1$Ak / (Mod1$quad[2] - Mod1$quad[1]), 6),
  posterior = 10 * c(t(Mod1$Pk[selected_examinees, ])),
  ID = rep(paste("examinee", selected_examinees), each = 121)
)

ggplot(data = post_sample, mapping = aes(x = X)) +
  geom_line(mapping = aes(y = posterior, group = ID, color = 'Posterior')) +
  geom_line(mapping = aes(y = prior, group = ID, color = 'Prior')) +
  labs(title = "Posterior densities for selected examinees",
       x = expression(theta), y = 'density') +
  facet_wrap(~ID, ncol = 2) +
  theme_bw()
```
```r
ggplot() +
  stat_function(
    fun = inform_f_test,
    args = list(Mod1)
  ) +
  stat_function(
    fun = inform_f_item,
    args = list(Mod1, 1),
    mapping = aes(color = "Item 1 (3 cats)")
  ) +
  stat_function(
    fun = inform_f_item,
    args = list(Mod1, 2),
    mapping = aes(color = "Item 2 (3 cats)")
  ) +
  stat_function(
    fun = inform_f_item,
    args = list(Mod1, 3),
    mapping = aes(color = "Item 3 (3 cats)")
  ) +
  stat_function(
    fun = inform_f_item,
    args = list(Mod1, 8),
    mapping = aes(color = "Item 8 (7 cats)")
  ) +
  stat_function(
    fun = inform_f_item,
    args = list(Mod1, 9),
    mapping = aes(color = "Item 9 (7 cats)")
  ) +
  stat_function(
    fun = inform_f_item,
    args = list(Mod1, 10, "p"),
    mapping = aes(color = "Item10 (7 cats)")
  ) +
  lims(x = c(-6, 6)) +
  labs(title = "Test information function", x = expression(theta), y = 'information') +
  theme_bw()
```
**Beta distribution**
$$
\begin{align}
f(x) &= \frac{1}{Beta(\alpha, \beta)}x^{\alpha-1}(1-x)^{\beta-1} \\
&= \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}x^{\alpha-1}(1-x)^{\beta-1}
\end{align}
$$
$E(x)=\frac{\alpha}{\alpha+\beta}$ and $Var(x)=\frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}$. If we reparameterize $\mu=\frac{\alpha}{\alpha+\beta}$ and $\nu=\alpha+\beta$,
$$
f(x) = \frac{\Gamma(\nu)}{\Gamma(\mu\nu)\Gamma(\nu(1-\mu))}x^{\mu\nu-1}(1-x)^{\nu(1-\mu)-1}
$$

No Jacobian transformation is required since $\mu$ and $\nu$ are parameters of $f(x)$, not variables.
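As a quick numerical sanity check (not part of IRTest), the reparameterized density can be compared with R's built-in `dbeta()`; the values of `mu`, `nu`, and the evaluation grid below are arbitrary illustration choices:

```r
# The reparameterized density with alpha = mu * nu and beta = nu * (1 - mu)
# should coincide with dbeta().
mu <- 0.3
nu <- 5
x  <- seq(0.05, 0.95, by = 0.05)
f_repar <- gamma(nu) / (gamma(mu * nu) * gamma(nu * (1 - mu))) *
  x^(mu * nu - 1) * (1 - x)^(nu * (1 - mu) - 1)
all.equal(f_repar, dbeta(x, shape1 = mu * nu, shape2 = nu * (1 - mu)))
```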
**Useful equations**
$\psi(\bullet)$ and $\psi_1(\bullet)$ denote the digamma and trigamma functions, respectively.
$$
\begin{align}
E[\log{x}] &= \int_{0}^{1}{\log{x}\,f(x) \,dx} \\
&= \int_{0}^{1}{\log{x} \frac{1}{Beta(\alpha, \beta)}x^{\alpha-1}(1-x)^{\beta-1} \,dx} \\
&= \frac{1}{Beta(\alpha, \beta)} \int_{0}^{1}{\log{(x)}\, x^{\alpha-1}(1-x)^{\beta-1} \,dx} \\
&= \frac{1}{Beta(\alpha, \beta)} \int_{0}^{1}{\frac{\partial x^{\alpha-1}(1-x)^{\beta-1}}{\partial \alpha} \,dx} \\
&= \frac{1}{Beta(\alpha, \beta)} \frac{\partial}{\partial \alpha}\int_{0}^{1}{ x^{\alpha-1}(1-x)^{\beta-1} \,dx} \\
&= \frac{1}{Beta(\alpha, \beta)} \frac{\partial Beta(\alpha, \beta)}{\partial \alpha} \\
&= \frac{\partial \log{[Beta(\alpha, \beta)]}}{\partial \alpha} \\
&= \frac{\partial \log{[\Gamma(\alpha)]}}{\partial \alpha} - \frac{\partial \log{[\Gamma(\alpha + \beta)]}}{\partial \alpha} \\
&= \psi(\alpha) - \psi(\alpha+\beta)
\end{align}
$$
Similarly, $E[\log{(1-x)}]=\psi(\beta) - \psi(\alpha+\beta)$.
Furthermore, using $\frac{\partial Beta(\alpha,\beta)}{\partial \alpha} = Beta(\alpha,\beta)\left(\psi(\alpha)-\psi(\alpha+\beta)\right)$ and $\frac{\partial^2 Beta(\alpha,\beta)}{\partial \alpha^2} = Beta(\alpha,\beta)\left(\psi(\alpha)-\psi(\alpha+\beta)\right)^2 + Beta(\alpha,\beta)\left(\psi_1(\alpha)-\psi_1(\alpha+\beta)\right)$,
$$
\begin{align}
E\left[(\log{x})^2\right] &= \frac{1}{Beta(\alpha, \beta)} \frac{\partial^2 Beta(\alpha, \beta)}{\partial \alpha^2} \\
&= \left(\psi(\alpha)-\psi(\alpha+\beta)\right)^2 + \left(\psi_1(\alpha)-\psi_1(\alpha+\beta)\right)
\end{align}
$$
This leads to,
$$
\begin{align}
Var\left[\log{x}\right] &= E\left[(\log{x})^2\right] - E\left[\log{x}\right]^2 \\
&= \psi_1(\alpha)-\psi_1(\alpha+\beta)
\end{align}
$$
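These identities are easy to verify numerically (not part of IRTest; the values of `alpha` and `beta` below are arbitrary):

```r
# Numerical check of E[log x] = psi(alpha) - psi(alpha + beta) and
# Var[log x] = psi_1(alpha) - psi_1(alpha + beta) for a Beta(alpha, beta) variable.
alpha <- 2
beta  <- 3
E_logx  <- integrate(function(x) log(x)   * dbeta(x, alpha, beta), 0, 1,
                     rel.tol = 1e-10)$value
E_logx2 <- integrate(function(x) log(x)^2 * dbeta(x, alpha, beta), 0, 1,
                     rel.tol = 1e-10)$value
all.equal(E_logx, digamma(alpha) - digamma(alpha + beta), tolerance = 1e-6)
all.equal(E_logx2 - E_logx^2, trigamma(alpha) - trigamma(alpha + beta), tolerance = 1e-6)
```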
**Continuous IRT**
$$
\mu = \frac{e^{a(\theta -b)}}{1+e^{a(\theta -b)}} \\
\frac{\partial \mu}{\partial \theta} = a\mu(1-\mu) \\
\frac{\partial \mu}{\partial a} = (\theta - b)\mu(1-\mu) \\
\frac{\partial \mu}{\partial b} = -a\mu(1-\mu) \\
\frac{\partial \mu}{\partial \nu} = 0
$$
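The derivative of $\mu$ with respect to $\theta$ can be checked against a finite difference (the helper name `mu_fun` and the parameter values below are arbitrary, not from IRTest):

```r
# mu as a function of theta, and a finite-difference check of
# d(mu)/d(theta) = a * mu * (1 - mu).
mu_fun <- function(theta, a, b) plogis(a * (theta - b))
a <- 1.2; b <- -0.4; theta <- 0.7; h <- 1e-6
analytic <- a * mu_fun(theta, a, b) * (1 - mu_fun(theta, a, b))
num_diff <- (mu_fun(theta + h, a, b) - mu_fun(theta - h, a, b)) / (2 * h)
all.equal(analytic, num_diff, tolerance = 1e-6)
```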
$$
f(x)=P(x|\, \theta, a, b, \nu) = \frac{\Gamma(\nu)}{\Gamma(\mu\nu)\Gamma(\nu(1-\mu))} x^{\mu\nu-1} (1-x)^{\nu(1-\mu)-1}
$$
$$ \log{f} = \log{\Gamma(\nu)}-\log{\Gamma(\mu\nu)}-\log{\Gamma(\nu(1-\mu))} + (\mu\nu-1)\log{x} + (\nu(1-\mu)-1) \log{(1-x)} $$
$$ \frac{\partial \log{f}}{\partial \theta} = a\nu\mu(1-\mu)\left[-\psi{(\mu\nu)}+\psi{(\nu(1-\mu))}+ \log{\left(\frac{x}{1-x}\right)}\right] $$
$$ E\left[ \left(\frac{\partial \log{f}}{\partial \theta}\right)^2 \right] = (a\nu\mu(1-\mu))^2\left[E\left[ \log{\left(\frac{x}{1-x}\right)^2}\right] -2 \left(\psi{(\mu\nu)}-\psi{(\nu(1-\mu))}\right )E\left[ \log{\left(\frac{x}{1-x}\right)}\right] +\left(\psi{(\mu\nu)}-\psi{(\nu(1-\mu))}\right )^2 \right] $$
$$
\begin{align}
E\left[ \log{\left(\frac{x}{1-x}\right)^2}\right] &= E\left[ \log{\left(x\right)^2}\right] -2 E\left[ \log{\left(x\right)}\log{\left(1-x\right)}\right] + E\left[ \log{\left(1-x\right)^2}\right] \\
&= Var\left[ \log{\left(x\right)}\right]+E\left[ \log{\left(x\right)}\right]^2 \\
&\qquad -2 Cov\left[ \log{\left(x\right)}\log{\left(1-x\right)}\right]-2E\left[ \log{\left(x\right)}\right]E\left[ \log{\left(1-x\right)}\right] \\
&\qquad + Var\left[ \log{\left(1-x\right)}\right]+E\left[ \log{\left(1-x\right)}\right]^2 \\
&= \psi_{1}(\alpha)-\psi_{1}(\alpha+\beta) +E\left[ \log{\left(x\right)}\right]^2 \\
&\qquad +2 \psi_{1}(\alpha+\beta)-2E\left[ \log{\left(x\right)}\right]E\left[ \log{\left(1-x\right)}\right] \\
&\qquad + \psi_{1}(\beta)-\psi_{1}(\alpha+\beta)+E\left[ \log{\left(1-x\right)}\right]^2 \\
&= \psi_{1}(\alpha)-\psi_{1}(\alpha+\beta) +\left[ \psi(\alpha)-\psi(\alpha+\beta)\right]^2 \\
&\qquad +2 \psi_{1}(\alpha+\beta)-2 \left(\psi(\alpha)-\psi(\alpha+\beta)\right)\left(\psi(\beta)-\psi(\alpha+\beta)\right) \\
&\qquad + \psi_{1}(\beta)-\psi_{1}(\alpha+\beta)+\left[\psi(\beta)-\psi(\alpha+\beta)\right]^2 \\
&= \psi_{1}(\alpha) +\psi_{1}(\beta) +\left[\psi(\alpha)-\psi(\beta)\right]^2
\end{align}
$$
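The final identity can likewise be confirmed numerically (not part of IRTest; `alpha` and `beta` below are arbitrary):

```r
# Check E[(log(x/(1-x)))^2] = psi_1(alpha) + psi_1(beta) + (psi(alpha) - psi(beta))^2.
alpha <- 2
beta  <- 3.5
lhs <- integrate(function(x) log(x / (1 - x))^2 * dbeta(x, alpha, beta), 0, 1,
                 rel.tol = 1e-10)$value
rhs <- trigamma(alpha) + trigamma(beta) + (digamma(alpha) - digamma(beta))^2
all.equal(lhs, rhs, tolerance = 1e-6)
```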
$$
\begin{align}
E\left[\left(\frac{\partial \log{f}}{\partial \theta}\right)^2 \right] &=
(a\nu\mu(1-\mu))^2\left[E\left[ \log{\left(\frac{x}{1-x}\right)^2}\right]
-2 \left(\psi{(\mu\nu)}-\psi{(\nu(1-\mu))}\right)E\left[ \log{\left(\frac{x}{1-x}\right)}\right]
+\left(\psi{(\mu\nu)}-\psi{(\nu(1-\mu))}\right)^2
\right] \\
&= (a\nu\mu(1-\mu))^2\left[\psi_{1}(\alpha) +\psi_{1}(\beta)
+\left[\psi(\alpha)-\psi(\beta)\right]^2
-2 \left(\psi{(\alpha)}-\psi{(\beta)}\right)\left(\psi{(\alpha)}-\psi{(\beta)}\right)
+\left(\psi{(\alpha)}-\psi{(\beta)}\right)^2
\right] \\
&= (a\nu\mu(1-\mu))^2\left[\psi_{1}(\alpha) +\psi_{1}(\beta)\right]
\end{align}
$$
$$ I(\theta) = E\left[\left(\frac{\partial \log{f}}{\partial \theta}\right)^2 \right] = (a\nu\mu(1-\mu))^2\left[\psi_{1}(\alpha) +\psi_{1}(\beta)\right] $$
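For illustration, the information function above can be evaluated directly in R. The helper name `info_cont` is hypothetical (it is not an IRTest function), and the parameter values are arbitrary:

```r
# Item information for the continuous-response model:
# I(theta) = (a * nu * mu * (1 - mu))^2 * (psi_1(alpha) + psi_1(beta)),
# with mu = plogis(a * (theta - b)), alpha = mu * nu, beta = nu * (1 - mu).
info_cont <- function(theta, a, b, nu) {
  mu <- plogis(a * (theta - b))
  (a * nu * mu * (1 - mu))^2 *
    (trigamma(mu * nu) + trigamma(nu * (1 - mu)))
}
info_cont(c(-2, 0, 2), a = 1, b = 0, nu = 10)
```

With $b = 0$, the information is symmetric around $\theta = 0$, since swapping $\mu$ and $1-\mu$ exchanges $\alpha$ and $\beta$.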
The marginal log-likelihood of an item can be expressed as follows:
$$ \ell = \sum_{j} \sum_{q}\gamma_{jq}\log{L_{jq}}, $$
where $\gamma_{jq}=E\left[\Pr\left(\theta_j \in \theta_{q}^{*}\right)\right]$ is the expected probability of respondent $j$'s ability ($\theta_j$) belonging to the quadrature point $\theta_{q}^{*}$ and is calculated in the E-step of the MML-EM procedure, and $L_{jq}$ is the likelihood of respondent $j$'s response at $\theta_{q}^{*}$ for the item of current interest.
$$
\frac{\partial \ell}{\partial a} = \sum_{q} \left(\theta_{q}-b\right)\nu\mu_{q}\left(1-\mu_{q}\right)\left[ S_{1q}-S_{2q}-f_q\left[ \psi(\mu_{q}\nu)-\psi(\nu(1-\mu_{q})) \right] \right] \\
\frac{\partial \ell}{\partial b} = -a\sum_{q}\nu\mu_{q}\left(1-\mu_{q}\right)\left[ S_{1q}-S_{2q}-f_q\left[ \psi(\mu_{q}\nu)-\psi(\nu(1-\mu_{q})) \right] \right] \\
\frac{\partial \ell}{\partial \nu} = N\psi(\nu) +\sum_{q}\left[ \mu_{q}(S_{1q}-f_q\psi(\mu_{q}\nu)) + (1-\mu_{q})(S_{2q}-f_q\psi(\nu(1-\mu_{q}))) \right]
$$
where $S_{1q} = \sum_{j}{\gamma_{jq}\log{x_j}}$ and $S_{2q} = \sum_{j}{\gamma_{jq}\log{(1-x_j)}}$. Since $E_q[S_{1q}]=f_q\left[\psi(\mu_{q}\nu)-\psi(\nu)\right]$ and $E_q[S_{2q}]=f_q\left[\psi(\nu(1-\mu_{q}))-\psi(\nu)\right]$, the expected values of the first derivatives are 0.
To keep $\nu$ positive, let $\nu = \exp{\xi}$; $\frac{\partial\nu}{\partial\xi}=\exp{\xi}=\nu$.
$$ \frac{\partial \ell}{\partial \xi} = N\nu\psi(\nu) +\nu\sum_{q}\left[ \mu_{q}(S_{1q}-f_q\psi(\mu_{q}\nu)) + (1-\mu_{q})(S_{2q}-f_q\psi(\nu(1-\mu_{q}))) \right] $$
$$
E\left( \frac{\partial^2\ell}{\partial a^2}\right) = -\sum_{q} \left\{\left(\theta_{q}-b\right)\nu\mu_{q}\left(1-\mu_{q}\right)\right\}^{2}f_{q}\left[ \psi_{1}(\mu_{q}\nu)+\psi_{1}(\nu(1-\mu_{q})) \right] \\
E\left(\frac{\partial^2\ell}{\partial a \partial b}\right) = a\sum_{q} \left(\theta_{q}-b\right)\left\{\nu\mu_{q}\left(1-\mu_{q}\right)\right\}^{2}f_{q}\left[ \psi_{1}(\mu_{q}\nu)+\psi_{1}(\nu(1-\mu_{q})) \right] \\
E\left(\frac{\partial^2\ell}{\partial a \partial \nu}\right) = -\sum_{q} \left(\theta_{q}-b\right)\nu\mu_{q}\left(1-\mu_{q}\right)f_q\left[ \mu_{q}\psi_{1}(\mu_{q}\nu)-(1-\mu_{q})\psi_{1}(\nu(1-\mu_{q})) \right] \\
E\left(\frac{\partial^2\ell}{\partial b^2}\right) = -a^{2}\sum_{q} \left\{\nu\mu_{q}\left(1-\mu_{q}\right)\right\}^{2}f_{q}\left[ \psi_{1}(\mu_{q}\nu)+\psi_{1}(\nu(1-\mu_{q})) \right] \\
E\left(\frac{\partial^2\ell}{\partial b \partial \nu}\right) = a\sum_{q} \nu\mu_{q}\left(1-\mu_{q}\right)f_q\left[ \mu_{q}\psi_{1}(\mu_{q}\nu)-(1-\mu_{q})\psi_{1}(\nu(1-\mu_{q})) \right] \\
E\left(\frac{\partial^2\ell}{\partial \nu^2}\right) = N\psi_{1}(\nu) - \sum_{q}f_q\left[ \mu_{q}^{2}\psi_{1}(\mu_{q}\nu)+(1-\mu_{q})^{2}\psi_{1}(\nu(1-\mu_{q})) \right]
$$
If we use $\xi$ instead of $\nu$,
$$
E\left(\frac{\partial^2\ell}{\partial a \partial \xi}\right) = -\sum_{q} \left(\theta_{q}-b\right)\nu^{2}\mu_{q}\left(1-\mu_{q}\right)f_q\left[ \mu_{q}\psi_{1}(\mu_{q}\nu)-(1-\mu_{q})\psi_{1}(\nu(1-\mu_{q})) \right] \\
E\left(\frac{\partial^2\ell}{\partial b \partial \xi}\right) = a\sum_{q} \nu^{2}\mu_{q}\left(1-\mu_{q}\right)f_q\left[ \mu_{q}\psi_{1}(\mu_{q}\nu)-(1-\mu_{q})\psi_{1}(\nu(1-\mu_{q})) \right] \\
E\left(\frac{\partial^2\ell}{\partial \xi^2}\right) = N\nu^{2}\psi_{1}(\nu) - \nu^{2}\sum_{q}f_q\left[ \mu_{q}^{2}\psi_{1}(\mu_{q}\nu)+(1-\mu_{q})^{2}\psi_{1}(\nu(1-\mu_{q})) \right]
$$
The function `DataGeneration` can be used in a preparation step. This function returns a set of artificial data and the true parameters underlying the data.
```r
Alldata <- DataGeneration(
  N = 1000,
  nitem_C = 8,
  latent_dist = "2NM",
  a_l = .3,
  a_u = .7,
  d = 1.664,
  sd_ratio = 2,
  prob = 0.3
)
data <- Alldata$data_C
theta <- Alldata$theta
colnames(data) <- paste0("item", 1:8)
```
```r
Mod1 <- IRTest_Cont(
  data = data,
  latent_dist = "KDE"
)
```
```r
### Summary
summary(Mod1)

### Log-likelihood
logLik(Mod1)

### The estimated item parameters
coef(Mod1)

### The estimated ability parameters
fscore <- factor_score(Mod1, ability_method = "WLE")
plot(theta, fscore$theta)
abline(b = 1, a = 0)

### Standard errors of ability parameter estimates
plot(fscore$theta, fscore$theta_se)
```
```r
plot(Mod1, mapping = aes(colour = "Estimated"), linewidth = 1) +
  lims(y = c(0, .75)) +
  geom_line(
    mapping = aes(
      x = seq(-6, 6, length = 121),
      y = dist2(seq(-6, 6, length = 121), prob = .3, d = 1.664, sd_ratio = 2),
      colour = "True"
    ),
    linewidth = 1
  ) +
  labs(title = "The estimated latent density using '2NM'", colour = "Type") +
  theme_bw()
```
```r
p1 <- plot_item(Mod1, 1)
p2 <- plot_item(Mod1, 2)
p3 <- plot_item(Mod1, 3)
p4 <- plot_item(Mod1, 4)
grid.arrange(p1, p2, p3, p4, ncol = 2, nrow = 2)
```
```r
reliability(Mod1)
```
Each examinee's posterior distribution is identified in the E-step of the estimation algorithm (i.e., EM algorithm).
Posterior distributions can be found in `Mod1$Pk`.
```r
set.seed(1)
selected_examinees <- sample(1:1000, 6)
post_sample <- data.frame(
  X = rep(seq(-6, 6, length.out = 121), 6),
  prior = rep(Mod1$Ak / (Mod1$quad[2] - Mod1$quad[1]), 6),
  posterior = 10 * c(t(Mod1$Pk[selected_examinees, ])),
  ID = rep(paste("examinee", selected_examinees), each = 121)
)

ggplot(data = post_sample, mapping = aes(x = X)) +
  geom_line(mapping = aes(y = posterior, group = ID, color = 'Posterior')) +
  geom_line(mapping = aes(y = prior, group = ID, color = 'Prior')) +
  labs(title = "Posterior densities for selected examinees",
       x = expression(theta), y = 'density') +
  facet_wrap(~ID, ncol = 2) +
  theme_bw()
```
```r
ggplot() +
  stat_function(
    fun = inform_f_test,
    args = list(Mod1)
  ) +
  stat_function(
    fun = inform_f_item,
    args = list(Mod1, 1),
    mapping = aes(color = "Item 1")
  ) +
  stat_function(
    fun = inform_f_item,
    args = list(Mod1, 2),
    mapping = aes(color = "Item 2")
  ) +
  stat_function(
    fun = inform_f_item,
    args = list(Mod1, 3),
    mapping = aes(color = "Item 3")
  ) +
  stat_function(
    fun = inform_f_item,
    args = list(Mod1, 4),
    mapping = aes(color = "Item 4")
  ) +
  stat_function(
    fun = inform_f_item,
    args = list(Mod1, 5),
    mapping = aes(color = "Item 5")
  ) +
  stat_function(
    fun = inform_f_item,
    args = list(Mod1, 6),
    mapping = aes(color = "Item 6")
  ) +
  stat_function(
    fun = inform_f_item,
    args = list(Mod1, 7),
    mapping = aes(color = "Item 7")
  ) +
  stat_function(
    fun = inform_f_item,
    args = list(Mod1, 8),
    mapping = aes(color = "Item 8")
  ) +
  lims(x = c(-6, 6)) +
  labs(title = "Test information function", x = expression(theta), y = 'information') +
  theme_bw()
```
As in the cases of dichotomous and polytomous items, the function `DataGeneration` can be used in the preparation step. This function returns artificial data and some useful objects for analysis (i.e., `theta`, `data_D`, `item_D`, `data_P`, and `item_P`).
```r
Alldata <- DataGeneration(
  model_D = 2,
  model_P = "GRM",
  N = 1000,
  nitem_D = 10,
  nitem_P = 5,
  latent_dist = "2NM",
  d = 1.664,
  sd_ratio = 1,
  prob = 0.5
)
DataD <- Alldata$data_D
DataP <- Alldata$data_P
theta <- Alldata$theta
colnames(DataD) <- paste0("item", 1:10)
colnames(DataP) <- paste0("item", 1:5)
```
```r
Mod1 <- IRTest_Mix(
  data_D = DataD,
  data_P = DataP,
  model_D = "2PL",
  model_P = "GRM",
  latent_dist = "KDE"
)
```
```r
### Summary
summary(Mod1)

### Log-likelihood
logLik(Mod1)

### The estimated item parameters
coef(Mod1)

### Standard errors of the item parameter estimates
coef_se(Mod1)

### The estimated ability parameters
fscore <- factor_score(Mod1, ability_method = "MLE")
plot(theta, fscore$theta)
abline(b = 1, a = 0)

### Standard errors of ability parameter estimates
plot(fscore$theta, fscore$theta_se)
```
```r
plot(Mod1, mapping = aes(colour = "Estimated"), linewidth = 1) +
  stat_function(
    fun = dist2,
    args = list(prob = .5, d = 1.664, sd_ratio = 1),
    mapping = aes(colour = "True"),
    linewidth = 1
  ) +
  lims(y = c(0, .75)) +
  labs(title = "The estimated latent density using '2NM'", colour = "Type") +
  theme_bw()
```
```r
p1 <- plot_item(Mod1, 1, type = "d")
p2 <- plot_item(Mod1, 4, type = "d")
p3 <- plot_item(Mod1, 8, type = "d")
p4 <- plot_item(Mod1, 10, type = "d")
grid.arrange(p1, p2, p3, p4, ncol = 2, nrow = 2)
```
```r
p1 <- plot_item(Mod1, 1, type = "p")
p2 <- plot_item(Mod1, 2, type = "p")
p3 <- plot_item(Mod1, 3, type = "p")
p4 <- plot_item(Mod1, 4, type = "p")
grid.arrange(p1, p2, p3, p4, ncol = 2, nrow = 2)
```
```r
item_fit(Mod1)
```

```r
reliability(Mod1)
```
Each examinee's posterior distribution is identified in the E-step of the estimation algorithm (i.e., EM algorithm).
Posterior distributions can be found in `Mod1$Pk`.
```r
set.seed(1)
selected_examinees <- sample(1:1000, 6)
post_sample <- data.frame(
  X = rep(seq(-6, 6, length.out = 121), 6),
  prior = rep(Mod1$Ak / (Mod1$quad[2] - Mod1$quad[1]), 6),
  posterior = 10 * c(t(Mod1$Pk[selected_examinees, ])),
  ID = rep(paste("examinee", selected_examinees), each = 121)
)

ggplot(data = post_sample, mapping = aes(x = X)) +
  geom_line(mapping = aes(y = posterior, group = ID, color = 'Posterior')) +
  geom_line(mapping = aes(y = prior, group = ID, color = 'Prior')) +
  labs(title = "Posterior densities for selected examinees",
       x = expression(theta), y = 'density') +
  facet_wrap(~ID, ncol = 2) +
  theme_bw()
```
```r
ggplot() +
  stat_function(
    fun = inform_f_test,
    args = list(Mod1)
  ) +
  stat_function(
    fun = inform_f_item,
    args = list(Mod1, 1, "d"),
    mapping = aes(color = "Dichotomous Item 1")
  ) +
  stat_function(
    fun = inform_f_item,
    args = list(Mod1, 2, "d"),
    mapping = aes(color = "Dichotomous Item 2")
  ) +
  stat_function(
    fun = inform_f_item,
    args = list(Mod1, 3, "d"),
    mapping = aes(color = "Dichotomous Item 3")
  ) +
  stat_function(
    fun = inform_f_item,
    args = list(Mod1, 1, "p"),
    mapping = aes(color = "Polytomous Item 1")
  ) +
  stat_function(
    fun = inform_f_item,
    args = list(Mod1, 2, "p"),
    mapping = aes(color = "Polytomous Item 2")
  ) +
  stat_function(
    fun = inform_f_item,
    args = list(Mod1, 3, "p"),
    mapping = aes(color = "Polytomous Item 3")
  ) +
  lims(x = c(-6, 6)) +
  labs(title = "Test information function", x = expression(theta), y = 'information') +
  theme_bw()
```
```r
data <- DataGeneration(
  N = 1000,
  nitem_D = 10,
  latent_dist = "2NM",
  d = 1.664,
  sd_ratio = 2,
  prob = 0.3
)$data_D
```
```r
model_fits <- list()
model_fits[[1]] <- IRTest_Dich(data)
model_fits[[2]] <- IRTest_Dich(data, latent_dist = "EHM")
model_fits[[3]] <- IRTest_Dich(data, latent_dist = "2NM")
model_fits[[4]] <- IRTest_Dich(data, latent_dist = "KDE")
for (i in 1:10) {
  model_fits[[i + 4]] <- IRTest_Dich(data, latent_dist = "DC", h = i)
}
names(model_fits) <- c("Normal", "EHM", "2NM", "KDE", paste0("DC", 1:10))
```
```r
do.call(what = "anova", args = model_fits[5:14])

do.call(what = "best_model", args = model_fits[5:14])
```
```r
do.call(what = "best_model", args = c(model_fits[c(1:4, 5)], criterion = "AIC"))
```