View source: R/MPdifferencefunctions.R
WellsBolt | R Documentation |
Wells and Bolt procedure for identifying item misfit.
WellsBolt(
MPmod,
dat,
nrep = 500,
kmax = 3,
theta = seq(-4, 4, length.out = 500),
w = dnorm(theta)/sum(dnorm(theta)),
parallel = c("furrr", "lapply"),
ncores = 2,
seed = 5224L,
p.adjust.method = NULL,
...
)
MPmod | Fitted |
dat | Original data. |
nrep | Number of replications for parametric bootstrap. |
kmax | Max for integer that controls polynomial order. |
theta | Grid across theta over which to perform estimation for Bolt (2002)'s approach. |
w | Weights corresponding to |
parallel | Method for parallel processing, if any. |
ncores | Number of processing cores to use for parallel processing. |
seed | Integer for setting seed when doing multicore processing. |
p.adjust.method | Optional argument passed to |
... | Arguments passed to |
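As a small illustration of the theta and w arguments, the sketch below rebuilds the default grid and weights from the usage signature and checks that the weights are normalized. The coarser grid (theta_coarse, w_coarse) is a hypothetical choice for illustration only.

```r
# Default quadrature grid and normalized standard-normal weights,
# copied from the defaults in the usage signature.
theta <- seq(-4, 4, length.out = 500)
w <- dnorm(theta) / sum(dnorm(theta))

sum(w)                      # the weights are normalized to sum to 1
length(w) == length(theta)  # one weight per grid point

# A coarser grid (hypothetical choice) trades accuracy for speed.
theta_coarse <- seq(-4, 4, length.out = 81)
w_coarse <- dnorm(theta_coarse) / sum(dnorm(theta_coarse))
```

Because the default for w is written as an expression in theta, R's lazy evaluation of default arguments would normally recompute it for a user-supplied grid; passing w explicitly alongside theta avoids relying on that.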
THIS FUNCTION IS EXPERIMENTAL: additional functionality will be added, which may change the function signature and capabilities of the function. The function is also very COMPUTATIONALLY INTENSIVE (i.e., slow).
Procedure of Wells and Bolt (originally Douglas and Cohen) to identify item misfit. This requires that a non- or semi-parametric
model first be fit to the data. So far, MP-based item models are supported and must be fit by a separate function
(e.g., simAnneal). The procedure from Bolt (2002) is followed to obtain, for each item, the parametric model
that best fits the non- or semi-parametrically estimated response function. This typically involves fitting the
parametric model item-by-item to the estimated response function from the non- or semi-parametric model.
Currently, the graded response model is used as the parametric model for this purpose. Fitting is done
over a grid for theta with weights from a standard normal distribution (as is typical for calibration), and nlminb
is used for optimization with numerical derivatives (the form of the log-likelihood for fitting is given by Bolt, 2002).
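A minimal sketch of the item-level fitting step described above, assuming a dichotomous item so that the graded response model reduces to a two-parameter logistic curve. The toy curve p_np, the objective neg_ell, and the starting values are illustrative assumptions and not part of the package; the weighted expected log-likelihood used here is one plausible reading of the criterion in Bolt (2002), not a transcription of it.

```r
# Grid and weights as in the function defaults (coarser grid for speed).
theta <- seq(-4, 4, length.out = 81)
w <- dnorm(theta) / sum(dnorm(theta))

# Stand-in for a semi-parametric response function (hypothetical toy curve).
p_np <- plogis(0.5 + 1.0 * theta + 0.15 * theta^3)

# Negative weighted expected log-likelihood of a 2PL curve, treating p_np
# as the "true" response probabilities at each grid point (assumption).
neg_ell <- function(par, theta, w, p_np) {
  eta <- par[1] * (theta - par[2])
  # log.p = TRUE keeps log(p) and log(1 - p) finite for extreme eta
  -sum(w * (p_np * plogis(eta, log.p = TRUE) +
            (1 - p_np) * plogis(-eta, log.p = TRUE)))
}

# nlminb uses numerical derivatives when no gradient is supplied.
fit <- nlminb(start = c(a = 1, b = 0), objective = neg_ell,
              theta = theta, w = w, p_np = p_np)

fit$par  # slope (a) and location (b) of the closest 2PL curve
p_par <- plogis(fit$par[["a"]] * (theta - fit$par[["b"]]))
```

For polytomous items the same idea would presumably be applied to the category response functions of the graded model rather than to a single curve.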
A discrepancy between the non/semi-parametric model and this newly fitted parametric model can then be computed,
such as RIMSD (functionality for IAD or ISD may be forthcoming). This value represents the closest that the parametric model
can come to the non/semi-parametric function; in other words, it measures how much the non/semi-parametric response function
differs from the best-fitting parametric model. However, the sampling distribution of this discrepancy is unknown.
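To make the discrepancy concrete, here is a hedged sketch of RIMSD computed on a weighted theta grid for a single dichotomous response curve. The helper rimsd and both toy curves are hypothetical; the package may define the measure differently for polytomous items (for example, on expected scores).

```r
# Root integrated mean squared difference between two response curves,
# approximated on a grid with weights that sum to 1 (assumed definition).
rimsd <- function(p_np, p_par, w) {
  stopifnot(length(p_np) == length(p_par), length(p_np) == length(w))
  sqrt(sum(w * (p_np - p_par)^2))
}

theta <- seq(-4, 4, length.out = 81)
w <- dnorm(theta) / sum(dnorm(theta))

p_np  <- plogis(0.5 + 1.0 * theta + 0.15 * theta^3)  # toy semi-parametric curve
p_par <- plogis(1.3 * (theta + 0.4))                 # hypothetical fitted 2PL curve

rimsd(p_np, p_par, w)  # larger values mean a worse parametric approximation
rimsd(p_np, p_np, w)   # identical curves give zero
```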
To obtain the sampling distribution of the discrepancy (currently RIMSD) and a p-value, a procedure resembling a parametric
bootstrap is performed:
1. Generate M datasets (M is set by nrep) under the parametric model estimated using the procedure from Bolt (2002) as described above. Here, we
generate data with characteristics (sample size and pattern of missing data) that are identical to the original dataset.
2. Fit the non- or semi-parametric model to each simulated dataset. Here, we use simAnneal with some defaults chosen
to be computationally efficient; currently only the graded version of the MP model is possible (support may change in future
versions of this function). The defaults currently are the following, and cannot yet be changed: itermax=500,
inittemp=5, type="aic", pvar=500, taumean=-1, temptype="logarithmic", items=1.
3. For each estimated model, the procedure by Bolt (2002) is again used to obtain a discrepancy value (RIMSD).
4. Because each bootstrap RIMSD is computed under the null hypothesis that the true response function is the parametric model,
a p-value for each observed RIMSD can be obtained from this distribution (i.e., how far in the tail is our observed value?).
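The final step of the list above can be sketched as follows, assuming the p-value is the proportion of bootstrap discrepancies at least as large as the observed one. The helper boot_pval, the toy numbers, and the choice of ">=" are assumptions for illustration; the function's exact tail rule is not stated here.

```r
# Bootstrap p-value per item: share of replicated discrepancies that are
# at least as large as the observed discrepancy (assumed tail rule).
boot_pval <- function(boot, obs) {
  stopifnot(ncol(boot) == length(obs))
  colMeans(sweep(boot, 2, obs, FUN = ">="))
}

# Toy bootstrap results: replications in rows, items in columns.
boot <- cbind(item1 = c(0.01, 0.02, 0.03, 0.04),
              item2 = c(0.05, 0.06, 0.07, 0.08))
obs <- c(item1 = 0.035, item2 = 0.01)

pval <- boot_pval(boot, obs)
pval

# Optional multiplicity adjustment across items using base R's p.adjust.
p.adjust(pval, method = "holm")
```

With only a handful of replications the attainable p-values are very coarse, which is why a large nrep is needed for accurate tail probabilities.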
A list with the following elements
boltModel
A list that essentially contains the information from the Bolt (2002) procedure. See bolt2002Model.
dif
A vector that provides the discrepancy between the non/semi-parametric model and the parametric model from the Bolt (2002) procedure.
pval
A vector that provides p-values for the discrepancy measure.
bootResults
A matrix that contains the results of the discrepancy measure from the parametric bootstrap. Replications are rows, columns are items.
# For now, just load an example dataset from mirt
data(Science, package = "mirt")
dat <- mxFactor(Science, levels = 1:4)  # mxFactor() comes from OpenMx
safit <- simAnneal(dat, k.mat=newkmat(0,2,4),
itermax = 4*6,
inittemp = 5,
type = "aic",
step = 1,
items = 1,
temptype = "logarithmic",
itemtype=rep("grmp",4))
samod <- safit$bestmod # best model according to SA
getkrec(samod, 4) # value of k for each item from best model
# Estimation settings similar to SA, but fewer iterations
# If generating under the graded model, fewer iterations should be necessary anyway
# 100 replications also probably not enough to get very accurate p-values
WB <- WellsBolt(samod, dat, nrep=100, kmax=2, theta=seq(-4,4,length.out=81),
parallel="furrr", ncores=2,
itermax = 12,
inittemp = 5,
type = "aic",
step = 1,
items = 1,
temptype = "logarithmic"
)
WB$dif # observed RIMSD values
WB$pval # p-values