Implements survey-weighted marginal maximum likelihood estimation, a type of regression where the outcome is a latent trait (such as student ability). Instead of using an estimate of student ability, the likelihood function marginalizes over it. Includes a variety of variance estimation strategies.
formula |
a formula object in the style of |
stuItems |
a list where each element is named a student ID and contains
a |
stuDat |
a |
paramTab |
a |
Q |
the number of integration points |
polyModel |
polytomous response model;
one of |
regType |
one of |
weightvar |
a variable name on |
control |
a list with four elements that control the fitting process. See Details. |
idVar |
a variable name on |
missingCode |
the value a score is set to that indicates the item is missing.
An item scored as |
missingValue |
the value to set items scored as |
multiCore |
allows the |
bobyqaControl |
a list that gets passed to |
The mml function models a latent outcome conditioning on student
item response data, student covariate data, and item parameter information;
these three parts are broken up into three arguments.
Student item response data go into stuItems, whereas student
covariates, weights, and sampling information go into stuDat.
The paramTab contains item parameter information for each item, the result
of a separate item parameter scaling. In the case of
the National Assessment of Educational Progress (NAEP),
they can be found online, for example, at
https://nces.ed.gov/nationsreportcard/tdw/analysis/scaling_irt.aspx.
The model for dichotomous response data is, by default, the three-parameter
logit (3PL) model, unless the item parameter information provided by users
suggests otherwise. For example, if the scaling used a two-parameter logit
(2PL) model, then the guessing parameter can simply be set to zero. For
polytomous response data, the model is dictated by the polyModel argument.
Student data are broken up into two parts. The item response data go
into stuItems, and the student covariates for the formula go into
stuDat. Information about items, such as item difficulties, is in
paramTab. All dichotomous items are assumed to be
3PL, though by setting the guessing parameter to zero, the user
can use a 2PL model or the one-parameter logit (1PL) or Rasch models.
The model for polytomous response data is dictated by the polyModel
argument.
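As a rough illustration of how setting the guessing parameter to zero turns a 3PL item into a 2PL item, the sketch below evaluates one common form of the 3PL response function. The helper name p3pl and the exact parameterization are assumptions made for illustration only; the vignette gives the form the package actually uses.

```r
# Hypothetical helper (not part of the package): one common 3PL form,
# P(correct | theta) = g + (1 - g) / (1 + exp(-D * a * (theta - d)))
p3pl <- function(theta, a, d, g = 0, D = 1.7) {
  g + (1 - g) / (1 + exp(-D * a * (theta - d)))
}

theta <- c(-2, 0, 2)

# 3PL item with a nonzero guessing parameter
p_3pl <- p3pl(theta, a = 1.05, d = 1, g = 0.22)

# the same item with guessing set to zero behaves as a 2PL item
p_2pl <- p3pl(theta, a = 1.05, d = 1, g = 0)

# 1PL-type item: guessing is zero and the discrimination is held at a
# common value (1 here, chosen arbitrarily)
p_1pl <- p3pl(theta, a = 1, d = 1, g = 0)
```

With guessing at zero, the probability of a correct response is exactly 0.5 when theta equals the difficulty d; with nonzero guessing, the curve never drops below g.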
The marginal maximum likelihood then integrates, over student ability, the
product of two terms: the likelihood of the student's item responses given
ability (from the assessment data) and the density of ability implied by the
linear model, which predicts each student's ability from the formula provided
and a residual standard error term. This integration happens from the
minimum node to the maximum node in the control argument (described
later in this section) with Q quadrature points.
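The integration described above can be sketched for a single student as follows. This is only an illustration of the idea under stated assumptions: equally spaced nodes, a rectangle-rule sum, 2PL-style items, a normal ability density, and invented values for the linear predictor (xb) and residual standard error (sigma). None of the names below come from the package, and its internal implementation may differ.

```r
# Illustrative only: all names and values here are assumptions
Q <- 34
min.node <- -4
max.node <- 4
nodes <- seq(min.node, max.node, length.out = Q)  # equally spaced grid (assumed)
width <- nodes[2] - nodes[1]

# one student's scored responses on three dichotomous items
a <- c(0.68, 1.22, 1.05)    # discriminations
d <- c(-0.33, 1.81, 1.00)   # difficulties
score <- c(1, 0, 1)
D <- 1.7

# likelihood of the observed responses at each node (ability value)
itemLik <- sapply(nodes, function(theta) {
  p <- 1 / (1 + exp(-D * a * (theta - d)))
  prod(p^score * (1 - p)^(1 - score))
})

# density of ability implied by the linear model for this student
xb <- 0.2      # assumed value of the linear predictor
sigma <- 0.9   # assumed residual standard error
abilityDens <- dnorm(nodes, mean = xb, sd = sigma)

# rectangle-rule approximation to the marginal likelihood and its log
margLik <- sum(itemLik * abilityDens) * width
logLik_i <- log(margLik)
```

Summing weighted values of logLik_i over students would give a survey-weighted log-likelihood that an optimizer could maximize over the regression coefficients and sigma.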
The stuItems argument has the scored student data. It is a list where
each element is named with the student ID and contains
a data.frame with at least two columns.
The first required column is named key and shows the item name as it
appears in paramTab; the second column is named score and shows the score
for that item. For binomial items, the score is 0 or 1. For GPCM
items, the scores start at zero as well. For GRM, the scores start at 1.
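A minimal sketch of the structure described above, using invented student IDs, item names, and scores (all hypothetical):

```r
# Hypothetical scored data in long form: one row per student-item pair
long <- data.frame(
  id    = c("s1", "s1", "s1", "s2", "s2"),
  key   = c("item1", "item2", "item3", "item1", "item3"),
  score = c(1, 0, 2, 0, 1),
  stringsAsFactors = FALSE
)

# one data.frame per student; list elements are named by student ID
stuItems <- split(long[, c("key", "score")], long$id)

names(stuItems)
stuItems[["s1"]]
```

Note that the grouping vector passed to split is the id column itself; passing the string "id" would place every row in a single group.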
The paramTab argument is a data.frame with a column named ItemID
that agrees with the key column in the stuItems argument,
and, for a 3PL item, columns P0, P1, and P2 for the “a”, “d”, and
“g” parameters, respectively; see the vignette for details of
the 3PL model.
For a GPCM model, P0 is the “a” parameter, and the other
columns are the “d” parameters; see the vignette for details
of the GPCM model.
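Because ItemID must agree with the key column, it can help to check the two against each other before fitting. The sketch below uses invented item names and parameter values and only the columns described above; the Examples section shows further columns (such as ScorePoints and MODEL) that a full table may carry.

```r
# Hypothetical item parameters; all values are invented for illustration
paramTab <- data.frame(
  ItemID = c("item1", "item2", "item3"),
  P0 = c(0.90, 1.20, 0.70),   # "a" parameter
  P1 = c(-0.50, 0.30, -0.20), # "d" parameter (3PL) or first "d" (GPCM)
  P2 = c(0.20, 0.15, 0.40),   # "g" parameter (3PL) or second "d" (GPCM)
  stringsAsFactors = FALSE
)

# keys observed in the scored data (normally collected from stuItems)
keys <- c("item1", "item3", "item2", "item1")

# every key should have a matching row in paramTab
missingItems <- setdiff(unique(keys), paramTab$ItemID)
if (length(missingItems) > 0) {
  stop("items without parameters: ", paste(missingItems, collapse = ", "))
}
```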
The control argument is a list with the following optional items: D, the
scale parameter, which defaults to 1.7; startVal, which is the starting
value for the coefficients; and min.node and max.node, which
set the range of nodes for all students and default to
-4 and 4, respectively. The quadrature points then are a range
from min.node to max.node with a total of Q nodes.
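A short sketch of a control list that spells out the documented defaults, together with the node grid it implies. Equal spacing of the nodes is an assumption here, and startVal is left out so that it stays at its default.

```r
# Sketch only: the documented defaults written out explicitly
control <- list(D = 1.7, min.node = -4, max.node = 4)
Q <- 34

# assumed: Q equally spaced nodes covering [min.node, max.node]
nodes <- seq(control$min.node, control$max.node, length.out = Q)
range(nodes)
length(nodes)

# the list would then be passed along with Q, for example
# mml(..., Q = Q, control = control)
```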
An object of class mml.means.
This is a list with elements:
call |
the call used to generate this |
coefficients |
the marginal maximum likelihood regression coefficients, including the estimated residual standard error |
LogLik |
the log-likelihood of the fit model |
X |
the design matrix of the marginal maximum likelihood regression |
Convergence |
a convergence note from the |
location |
used for scaling the estimates |
scale |
used for scaling the estimates |
lnlf |
the likelihood function |
rr1 |
the density function of each individual, conditional only on item responses in |
stuDat |
the |
weightvar |
the weight variable |
nodes |
the nodes the likelihood was evaluated on |
iterations |
the number of iterations required to reach convergence |
obs |
the number of observations used |
Harold Doran, Paul Bailey, Claire Kelley, and Sun-joo Lee
## Not run:
# get NAEP Primer data
require(EdSurvey)
# data
sdf <- readNAEP(system.file("extdata/data", "M36NT2PM.dat", package = "NAEPprimer"))
cols <- c("m066401", "m093701", "m086001", "m051901", "m067801", "m046501",
"origwt", "repgrp1", "jkunit", "dsex")
data <- getData(sdf, varnames=cols, addAttributes=TRUE,
omittedLevels=FALSE, defaultConditions=FALSE,
returnJKreplicates=FALSE)
# 3PL items only:
# P0 is the discrimination parameter (a),
# P1 is the item difficulty (d),
# P2 is the guessing parameter (g)
# polytomous responses could use P3-P10 for more difficulties
paramTab <- structure(list(ItemID = c("m066401", "m093701", "m086001",
"m051901", "m067801", "m046501"),
P0 = c(0.68, 1.22, 1.05, 1.6, 0.86, 1.03),
P1 = c(-0.33, 1.81, 1, 0.61, -1.61, -0.14),
P2 = c(0.15, 0.17, 0.22, 0.08, 0.06, 0.37),
P3 = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_),
P4 = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_),
P5 = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_),
P6 = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_),
P7 = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_),
P8 = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_),
P9 = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_),
P10 = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_),
ScorePoints = c(1, 1, 1, 1, 1, 1),
MODEL = c("3pl", "3pl", "3pl", "3pl", "3pl", "3pl")),
row.names = c(1L, 3L, 4L, 5L, 9L, 13L),
class = "data.frame", location = 277.1563, scale = 37.7297)
# scores an item as correct if it contains an asterisk and as skipped if it
# is "Omitted", "Not Reached", or "Multiple". The value NA is left as NA.
# this score function is intended to be simple and does not reflect
# typical NAEP scoring.
simpleScore <- function(col) {
  # 1 if the response contains an asterisk (correct), otherwise 0
  score0 <- 0+grepl("*", col, fixed=TRUE)
  # 8 marks a skipped item
  score1 <- ifelse(col %in% c("Omitted", "Not Reached", "Multiple"), 8, score0)
  # keep NA as NA
  score2 <- ifelse(col %in% NA, NA, score1)
  return(score2)
}
# score each item in paramTab
for(name in paramTab$ItemID){
  # show score output vs input data
  print(table(sdf[,name], simpleScore(sdf[,name]), useNA="ifany"))
  # score item
  data[,name] <- simpleScore(data[,name])
}
# make stuItems
data$id <- 1:nrow(data)
# first make a long data.frame of the item score data
stuItems <- reshape(data=data, varying=c(paramTab$ItemID), idvar=c("id"),
                    direction="long", v.names="score", times=paramTab$ItemID,
                    timevar="key")[,c("id", "key", "score")]
# then break it up into a single data.frame per student, named by student ID
# (split on the id column itself; the string "id" would give a single group)
stuItems <- split(stuItems, stuItems$id)
# stuDat is the student covariates, weights, and sampling information
# used for variance estimation
stuDat <- data[, c('origwt', 'repgrp1', 'jkunit', 'dsex', 'id')]
### MML call
mml1 <- mml(~dsex, stuItems=stuItems,
stuDat=stuDat, paramTab=paramTab,
regType = 'regression', Q=34, idVar="id", weightvar = "origwt")
# summary, assumes the sample was drawn IID
summary(mml1)
# summary, accounts for correlation between students in the same schools
summary(mml1, varType="Taylor", stratavar="repgrp1", psuvar="jkunit")
## End(Not run)