WellsBolt: Wells and Bolt procedure for identifying item misfit.
In falkcarl/mpirt: Semi- and non-parametric item response theory

WellsBolt

R Documentation

Wells and Bolt procedure for identifying item misfit.

Description

Wells and Bolt procedure for identifying item misfit.

Usage

WellsBolt(
  MPmod,
  dat,
  nrep = 500,
  kmax = 3,
  theta = seq(-4, 4, length.out = 500),
  w = dnorm(theta)/sum(dnorm(theta)),
  parallel = c("furrr", "lapply"),
  ncores = 2,
  seed = 5224L,
  p.adjust.method = NULL,
  ...
)

Arguments

`MPmod`	Fitted `mxModel`, e.g., from `simAnneal`.
`dat`	Original data.
`nrep`	Number of replications for parametric bootstrap.
`kmax`	Maximum value of the integer `k` that controls the polynomial order.
`theta`	Grid across theta over which to perform estimation for Bolt (2002)'s approach.
`w`	Weights corresponding to `theta` for performing estimation for Bolt (2002)'s approach.
`parallel`	Method for parallel processing, if any: `"furrr"`, which uses the `furrr` package with the number of processing cores specified by `ncores`, or `"lapply"`, which does not actually do parallel processing.
`ncores`	Number of processing cores to use for parallel processing.
`seed`	Integer for setting seed when doing multicore processing.
`p.adjust.method`	Optional argument passed to `p.adjust` for making adjustments to p-values (e.g., Bonferroni, Benjamini-Hochberg).
`...`	Arguments passed to `simAnneal`.
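
As a rough illustration of what `p.adjust.method` controls, the sketch below applies `stats::p.adjust` directly to a vector of made-up p-values. The p-values and item names are hypothetical, and how `WellsBolt` applies the adjustment internally is not shown here; only the base R call is.

```r
# Hypothetical raw p-values for four items (illustrative numbers only)
pval <- c(item1 = 0.004, item2 = 0.030, item3 = 0.200, item4 = 0.650)

# Method names accepted by stats::p.adjust
p.adjust.methods

# Bonferroni controls the familywise error rate
p_bonf <- p.adjust(pval, method = "bonferroni")

# Benjamini-Hochberg controls the false discovery rate
p_bh <- p.adjust(pval, method = "BH")

# Side-by-side comparison of raw and adjusted values
round(cbind(raw = pval, bonferroni = p_bonf, BH = p_bh), 3)
```

A string such as `"bonferroni"` or `"BH"` is what would be supplied as `p.adjust.method`.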

Details

THIS FUNCTION IS EXPERIMENTAL: additional functionality will be added, which may change the function signature and capabilities of the function. The function is also very COMPUTATIONALLY INTENSIVE (i.e., slow).

Procedure from Wells and Bolt (originally Douglas and Cohen) to identify item misfit. It requires that a non- or semi-parametric model first be fit to the data. So far, MP-based item models are supported, and they must be fit by a separate function (e.g., simAnneal).

The procedure from Bolt (2002) is then followed to obtain, for each item, the parametric model that best fits the non- or semi-parametrically estimated response function. This typically involves fitting the parametric model item-by-item to the estimated response function from the non- or semi-parametric model. Currently, the graded response model is used as the parametric model; fitting is done over a grid for theta with weights from a standard normal distribution (as this is typical for calibration); and nlminb is used for optimization with numerical derivatives (the form of the log-likelihood for fitting is given by Bolt, 2002).

A discrepancy between the non/semi-parametric model and this newly fitted parametric model can then be computed, such as RIMSD (functionality for IAD or ISD may be forthcoming). This value represents the closest the parametric model can come to the non/semi-parametric function, or in other words, how much the non/semi-parametric response function differs from this parametric model. However, the sampling distribution of this discrepancy is unknown.
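
The fitting step can be sketched in base R for a single dichotomous item. This is a simplified illustration, not the package's implementation: it uses a 2PL curve instead of the graded response model, a made-up curve `p_np` as a stand-in for the estimated semi-parametric response function, a weighted expected log-likelihood as the assumed fitting criterion (the exact form in Bolt, 2002, is not reproduced here), and a weighted root mean squared difference as the assumed definition of RIMSD.

```r
# Grid and normalized standard-normal weights, as in the WellsBolt defaults
theta <- seq(-4, 4, length.out = 81)
w <- dnorm(theta) / sum(dnorm(theta))

# Hypothetical stand-in for an estimated semi-parametric response function:
# a logistic curve with a lower asymptote, which a 2PL cannot match exactly
p_np <- 0.15 + 0.85 * plogis(1.3 * (theta - 0.4))

# Parametric (2PL) response function; par = c(slope, intercept)
p_2pl <- function(par, theta) plogis(par[1] * theta + par[2])

# Assumed criterion: weighted negative expected log-likelihood, treating
# p_np as the expected proportion correct at each grid point
neg_ell <- function(par) {
  p <- p_2pl(par, theta)
  p <- pmin(pmax(p, 1e-10), 1 - 1e-10)  # guard against log(0)
  -sum(w * (p_np * log(p) + (1 - p_np) * log(1 - p)))
}

# nlminb uses numerical derivatives when no gradient is supplied
fit <- nlminb(start = c(1, 0), objective = neg_ell)

# Assumed discrepancy: weighted root mean squared difference between curves
rimsd <- sqrt(sum(w * (p_np - p_2pl(fit$par, theta))^2))
rimsd
```

Because the weights concentrate near theta = 0, differences in the tails of the grid contribute little to the discrepancy.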

To obtain the sampling distribution for the discrepancy (currently RIMSD) and a p-value, a procedure resembling a parametric bootstrap is performed:

1. Generate M datasets (the number of replications, `nrep`) under the parametric model estimated using the procedure from Bolt (2002) as described above. The generated data have the same characteristics (sample size and pattern of missing data) as the original dataset.
2. Fit the non- or semi-parametric model to each simulated dataset. Here, simAnneal is used with some defaults chosen to be computationally efficient, and currently only the graded version of the MP model is possible (support may change in future versions of this function). The defaults currently are the following, and cannot yet be changed: itermax=500, inittemp=5, type="aic", pvar=500, taumean=-1, temptype="logarithmic", items=1.
3. For each estimated model, the procedure by Bolt (2002) is again used to obtain a discrepancy value (RIMSD).
4. Because these RIMSD values are computed essentially under the null hypothesis that the true response function is the parametric model, they serve as a reference distribution from which a p-value for each observed RIMSD is obtained (i.e., how far in the tail is our observed value?).
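
The overall shape of this bootstrap can be sketched in base R with toy ingredients. Everything below is an assumption made for illustration: the data are dichotomous, the generating model is a 2PL with made-up parameters `a` and `b`, and `toy_discrepancy` is a cheap stand-in for steps 2 and 3 (refitting with simAnneal and computing RIMSD), which are too slow to reproduce here. The p-value is taken as the proportion of bootstrap values at least as large as the observed one, which may differ in detail from what `WellsBolt` computes.

```r
set.seed(1)

# Toy "original" data: 200 respondents, 3 dichotomous items, some missing
n <- 200
a <- c(1.2, 0.8, 1.5)    # hypothetical slopes of the fitted parametric model
b <- c(-0.5, 0.0, 0.7)   # hypothetical difficulties
orig <- matrix(rbinom(n * 3, 1, 0.5), n, 3)
orig[sample(length(orig), 30)] <- NA

# Step 1: simulate under the parametric model, keeping the sample size and
# the pattern of missing data of the original dataset
sim_like <- function(orig, a, b) {
  th <- rnorm(nrow(orig))
  p <- plogis(sweep(outer(th, b, "-"), 2, a, "*"))  # a_j * (theta_i - b_j)
  y <- matrix(rbinom(length(p), 1, p), nrow(orig), ncol(orig))
  y[is.na(orig)] <- NA
  y
}

# Stand-in for steps 2-3: absolute difference between the observed item
# proportions and the model-implied marginal proportions (NOT RIMSD)
theta <- seq(-4, 4, length.out = 81)
w <- dnorm(theta) / sum(dnorm(theta))
p_model <- sapply(seq_along(a),
                  function(j) sum(w * plogis(a[j] * (theta - b[j]))))
toy_discrepancy <- function(y) abs(colMeans(y, na.rm = TRUE) - p_model)

# Reference distribution: replications are rows, items are columns
M <- 200
boot <- t(replicate(M, toy_discrepancy(sim_like(orig, a, b))))
observed <- toy_discrepancy(orig)

# Step 4: upper-tail p-value per item
pval <- colMeans(sweep(boot, 2, observed, ">="))
pval
```

The layout of `boot` mirrors the `bootResults` element described under Value, with one row per replication and one column per item.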

Value

A list with the following elements

Slots

boltModel: A list that essentially contains the information from the Bolt (2002) procedure. See bolt2002Model.
dif: A vector that provides the discrepancy between the non/semi-parametric model and the parametric model from the Bolt (2002) procedure.
pval: A vector that provides p-values for the discrepancy measure.
bootResults: A matrix that contains the results of the discrepancy measure from the parametric bootstrap. Replications are rows, columns are items.

Examples



# For now, just load a data set from mirt
data(Science, package = "mirt")

dat <- mxFactor(Science,levels=1:4)
safit <- simAnneal(dat, k.mat=newkmat(0,2,4),
                   itermax = 4*6,
                   inittemp = 5,
                   type = "aic",
                   step = 1,
                   items = 1,
                   temptype = "logarithmic",
                   itemtype=rep("grmp",4))

samod <- safit$bestmod # best model according to SA

getkrec(samod, 4) # value of k for each item from best model

# Estimation settings similar to SA, but fewer iterations
# If generating under the graded model, fewer iterations should be necessary anyway
# 100 replications are also probably not enough to get very accurate p-values
WB <- WellsBolt(samod, dat, nrep=100, kmax=2, theta=seq(-4,4,length.out=81),
               parallel="furrr", ncores=2,
               itermax = 12,
               inittemp = 5,
               type = "aic",
               step = 1,
               items = 1,
               temptype = "logarithmic"
               )

WB$dif # observed RIMSD values
WB$pval # p-values
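
To give a sense of why 100 replications may be too few, the sketch below computes the Monte Carlo standard error of a p-value estimated as a simple proportion of replications. It assumes the p-value is such a proportion; `mc_se` is a hypothetical helper, not part of the package.

```r
# Monte Carlo standard error of a p-value estimated as a proportion of
# nrep independent bootstrap replications (binomial standard error)
mc_se <- function(p, nrep) sqrt(p * (1 - p) / nrep)

# Precision of an estimated p-value near 0.05 for several choices of nrep
nrep <- c(100, 500, 2000)
data.frame(nrep = nrep, se = mc_se(0.05, nrep))
```

The standard error shrinks with the square root of `nrep`, so halving it requires roughly four times as many replications.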

falkcarl/mpirt documentation built on July 11, 2024, 12:09 a.m.