CDM | R Documentation |
Estimates the parameters of cognitive diagnosis models (CDMs) with the MMLE/EM algorithm (de la Torre, 2009; de la Torre, 2011)
or the MMLE/BM algorithm (Ma & Jiang, 2020). The function builds on functions imported from the GDINA package and extends them.
It estimates the most commonly used models (e.g., GDINA, DINA, DINO, ACDM, LLM, and rRUM) and also offers Bayes modal estimation
(BM; Ma & Jiang, 2020), which gives more reliable estimates, especially in small samples.
Monotonicity constraints can be imposed during estimation.
CDM(
  Y,
  Q,
  model = "GDINA",
  method = "EM",
  mono.constraint = TRUE,
  maxitr = 2000,
  verbose = 1
)
Y |
A required |
Q |
A required binary |
model |
Type of model to be fitted; can be |
method |
Method used to estimate the CDM parameters; one of "EM" or "BM" |
mono.constraint |
Logical indicating whether monotonicity constraints should be imposed during estimation.
Default = TRUE |
maxitr |
A vector with one value for each item or nonzero category, or a scalar applied to all items,
specifying the maximum number of EM or BM cycles allowed. Default = 2000 |
verbose |
Can be |
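The monotonicity constraint controlled by mono.constraint can be illustrated with a small base-R check. This is only a sketch of the idea (mastering additional attributes should never lower the success probability); the helper is_monotone and the probabilities are hypothetical and are not part of Qval or GDINA.

```r
# All attribute patterns for K* = 2 required attributes: 00, 10, 01, 11
patterns <- as.matrix(expand.grid(a1 = 0:1, a2 = 0:1))

# Hypothetical helper: TRUE if success probabilities never decrease
# when an examinee masters additional attributes
is_monotone <- function(p, patterns) {
  L <- nrow(patterns)
  for (a in seq_len(L)) {
    for (b in seq_len(L)) {
      # pattern b dominates pattern a if it masters everything a masters
      if (all(patterns[b, ] >= patterns[a, ]) && p[b] < p[a]) return(FALSE)
    }
  }
  TRUE
}

is_monotone(c(0.10, 0.30, 0.40, 0.90), patterns)  # TRUE
is_monotone(c(0.10, 0.30, 0.40, 0.25), patterns)  # FALSE: pattern 11 is below 10
```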
CDMs are statistical models that incorporate cognitive structure variables. They define the probability of an examinee's response to an item by assuming a mechanism through which attributes interact. In a dichotomously scored test, this is the probability of a correct answer. Depending on how specific or general their assumptions are, CDMs can be divided into reduced and saturated models.
Reduced CDMs make specific assumptions about how attributes interact, so the form of the interaction is fixed in advance. Representative reduced models include the Deterministic Inputs, Noisy "And" Gate (DINA) model (Haertel, 1989; Junker & Sijtsma, 2001; de la Torre & Douglas, 2004), the Deterministic Inputs, Noisy "Or" Gate (DINO) model (Templin & Henson, 2006), the Additive Cognitive Diagnosis Model (A-CDM; de la Torre, 2011), and the reduced Reparametrized Unified Model (rRUM; Hartz, 2002), among others. In contrast, saturated models, such as the Log-Linear Cognitive Diagnosis Model (LCDM; Henson et al., 2009) and the generalized Deterministic Inputs, Noisy "And" Gate (G-DINA) model (de la Torre, 2011), make no strict assumptions about how attributes interact. When appropriate constraints are applied, saturated models reduce to various reduced models (Henson et al., 2009; de la Torre, 2011).
The LCDM is a saturated CDM developed entirely within the cognitive diagnosis framework. Unlike reduced models that consider only the main effects of attributes, it also includes the interactions between attributes and therefore makes more general assumptions. It defines the probability of a correct response as follows:
$$
P(X_{pi}=1|\boldsymbol{\alpha}_{l}) =
\frac{\exp \left[\lambda_{i0} + \boldsymbol{\lambda}_{i}^{T} \boldsymbol{h} (\boldsymbol{q}_{i}, \boldsymbol{\alpha}_{l}) \right]}
{1 + \exp \left[\lambda_{i0} + \boldsymbol{\lambda}_{i}^{T} \boldsymbol{h} (\boldsymbol{q}_{i}, \boldsymbol{\alpha}_{l}) \right]}
$$
$$
\boldsymbol{\lambda}_{i}^{T} \boldsymbol{h}(\boldsymbol{q}_{i}, \boldsymbol{\alpha}_{l}) =
\sum_{k=1}^{K^\ast}\lambda_{ik}\alpha_{lk} +\sum_{k=1}^{K^\ast-1}\sum_{k'=k+1}^{K^\ast}
\lambda_{ikk'}\alpha_{lk}\alpha_{lk'} +
\cdots + \lambda_{i12 \cdots K^\ast}\prod_{k=1}^{K^\ast}\alpha_{lk}
$$
In these expressions, $P(X_{pi}=1|\boldsymbol{\alpha}_{l})$ is the probability that an examinee with attribute mastery pattern $\boldsymbol{\alpha}_{l}$ ($l=1,2,\cdots,L$, with $L=2^{K^\ast}$) answers item $i$ correctly. Here, $K^\ast = \sum_{k=1}^{K} q_{ik}$ is the number of attributes in the collapsed q-vector, $\lambda_{i0}$ is the intercept parameter, and $\boldsymbol{\lambda}_{i}=(\lambda_{i1}, \lambda_{i2}, \cdots, \lambda_{i12}, \cdots, \lambda_{i12{\cdots}K^\ast})$ is the vector of attribute effects. Specifically, $\lambda_{ik}$ is the main effect of attribute $k$, $\lambda_{ikk'}$ is the interaction effect between attributes $k$ and $k'$, and $\lambda_{i12{\cdots}K^\ast}$ is the interaction effect of all required attributes.
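The LCDM definition above can be traced numerically for a single item. The following is a minimal base-R sketch, assuming an item that requires two attributes; all parameter values are made up for illustration, and the code does not call Qval or GDINA.

```r
# Illustrative LCDM parameters for one item with K* = 2 required attributes
lambda0  <- -2.0          # intercept lambda_{i0}
lambda   <- c(1.5, 1.0)   # main effects lambda_{i1}, lambda_{i2}
lambda12 <- 1.5           # two-way interaction lambda_{i12}

# All L = 2^{K*} reduced attribute patterns, one per row: 00, 10, 01, 11
patterns <- as.matrix(expand.grid(a1 = 0:1, a2 = 0:1))

# Linear predictor: intercept + main effects + interaction
lin_pred <- lambda0 + patterns %*% lambda +
  lambda12 * patterns[, 1] * patterns[, 2]

# The inverse logit gives P(X = 1 | alpha_l) for each pattern
p <- as.vector(plogis(lin_pred))
names(p) <- apply(patterns, 1, paste, collapse = "")
print(round(p, 3))
```

With positive effects, the success probability rises as more required attributes are mastered, and the pattern 11 receives the intercept plus every effect.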
The G-DINA model, proposed by de la Torre (2011), is another saturated model. It offers three link functions, the identity link, the log link, and the logit link, which are defined as follows:
$$
P(X_{pi}=1|\boldsymbol{\alpha}_{l}) =
\delta_{i0} + \sum_{k=1}^{K^\ast}\delta_{ik}\alpha_{lk} +\sum_{k=1}^{K^\ast-1}\sum_{k'=k+1}^{K^\ast}\delta_{ikk'}\alpha_{lk}\alpha_{lk'} +
\cdots + \delta_{i12{\cdots}K^\ast}\prod_{k=1}^{K^\ast}\alpha_{lk}
$$
$$
\log \left[P(X_{pi}=1|\boldsymbol{\alpha}_{l}) \right] =
v_{i0} + \sum_{k=1}^{K^\ast}v_{ik}\alpha_{lk} +\sum_{k=1}^{K^\ast-1}\sum_{k'=k+1}^{K^\ast}v_{ikk'}\alpha_{lk}\alpha_{lk'} +
\cdots + v_{i12{\cdots}K^\ast}\prod_{k=1}^{K^\ast}\alpha_{lk}
$$
$$
\operatorname{logit} \left[P(X_{pi}=1|\boldsymbol{\alpha}_{l}) \right] =
\lambda_{i0} + \sum_{k=1}^{K^\ast}\lambda_{ik}\alpha_{lk} +\sum_{k=1}^{K^\ast-1}\sum_{k'=k+1}^{K^\ast}\lambda_{ikk'}\alpha_{lk}\alpha_{lk'} +
\cdots + \lambda_{i12{\cdots}K^\ast}\prod_{k=1}^{K^\ast}\alpha_{lk}
$$
In these expressions, $\delta_{i0}$, $v_{i0}$, and $\lambda_{i0}$ are the intercept parameters for the identity, log, and logit links, respectively; $\delta_{ik}$, $v_{ik}$, and $\lambda_{ik}$ are the corresponding main effect parameters of $\alpha_{lk}$; $\delta_{ikk'}$, $v_{ikk'}$, and $\lambda_{ikk'}$ are the corresponding interaction effect parameters between $\alpha_{lk}$ and $\alpha_{lk'}$; and $\delta_{i12{\cdots}K^\ast}$, $v_{i12{\cdots}K^\ast}$, and $\lambda_{i12{\cdots}K^\ast}$ are the corresponding interaction effect parameters of $\alpha_{l1}{\cdots}\alpha_{lK^\ast}$. When the logit link is adopted, the G-DINA model is equivalent to the LCDM.
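Because the model is saturated, each link function can reproduce the same set of success probabilities with its own parameters. The base-R sketch below illustrates this for one item with two required attributes; the probabilities are illustrative, and the design matrix M is an assumption of this example rather than an object from the GDINA package.

```r
# Success probabilities for patterns 00, 10, 01, 11 (illustrative values)
p <- c(0.10, 0.30, 0.40, 0.90)

# Saturated design matrix: intercept, two main effects, one interaction
patterns <- as.matrix(expand.grid(a1 = 0:1, a2 = 0:1))
M <- cbind(1, patterns, patterns[, 1] * patterns[, 2])
colnames(M) <- c("int", "a1", "a2", "a1a2")

# Solve M %*% par = link(p) for each link function
delta  <- solve(M, p)           # identity link
v      <- solve(M, log(p))      # log link
lambda <- solve(M, qlogis(p))   # logit link (the LCDM parameterization)

# Each parameter set maps back to the same probabilities
p_identity <- as.vector(M %*% delta)
p_log      <- as.vector(exp(M %*% v))
p_logit    <- as.vector(plogis(M %*% lambda))
```

The three parameter vectors differ, but p_identity, p_log, and p_logit all equal p, which is why the choice of link only matters once constraints are imposed.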
Specifically, the A-CDM can be formulated as:
$$
P(X_{pi}=1|\boldsymbol{\alpha}_{l}) =
\delta_{i0} + \sum_{k=1}^{K^\ast}\delta_{ik}\alpha_{lk}
$$
The rRUM can be written as:
$$
\log \left[P(X_{pi}=1|\boldsymbol{\alpha}_{l}) \right] =
v_{i0} + \sum_{k=1}^{K^\ast}v_{ik}\alpha_{lk}
$$
The item response function for the linear logistic model (LLM) is given by:
$$
\operatorname{logit}\left[P(X_{pi}=1|\boldsymbol{\alpha}_{l}) \right] =
\lambda_{i0} + \sum_{k=1}^{K^\ast}\lambda_{ik}\alpha_{lk}
$$
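The A-CDM, rRUM, and LLM share the same additive structure (an intercept plus main effects, with no interactions) and differ only in the link function. A minimal base-R sketch for one item with two required attributes follows; the parameter values are made up for illustration.

```r
# Attribute patterns 00, 10, 01, 11 and a main-effects-only design matrix
patterns <- as.matrix(expand.grid(a1 = 0:1, a2 = 0:1))
X <- cbind(1, patterns)   # intercept + main effects, no interaction column

# A-CDM: identity link, effects add on the probability scale
p_acdm <- as.vector(X %*% c(0.10, 0.35, 0.45))

# rRUM: log link, effects multiply on the probability scale
p_rrum <- as.vector(exp(X %*% c(log(0.18), log(2), log(2.5))))

# LLM: logit link, effects add on the log-odds scale
p_llm <- as.vector(plogis(X %*% c(-2, 1.5, 2.5)))

print(round(rbind(ACDM = p_acdm, rRUM = p_rrum, LLM = p_llm), 3))
```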
In the DINA model, every item is characterized by two parameters: guessing ($g$) and slip ($s$). In the traditional DINA parameterization, a latent response $\eta_{li}$, for an examinee with attribute mastery pattern $\boldsymbol{\alpha}_{l}$ responding to item $i$, is defined as follows:
$$
\eta_{li}=\prod_{k=1}^{K}\alpha_{lk}^{q_{ik}}
$$
If an examinee whose attribute mastery pattern is $\boldsymbol{\alpha}_{l}$ has mastered every attribute required by item $i$, $\eta_{li}$ equals 1; otherwise, $\eta_{li}$ equals 0. The DINA model's item response function can be written concisely as:
$$
P(X_{pi}=1|\boldsymbol{\alpha}_{l}) =
(1-s_i)^{\eta_{li}}g_i^{(1-\eta_{li})} =
\delta_{i0}+\delta_{i12{\cdots}K^\ast}\prod_{k=1}^{K^\ast}\alpha_{lk}
$$
Here, $(1-s_i)^{\eta_{li}}g_i^{(1-\eta_{li})}$ is the original expression of the DINA model, while $\delta_{i0}+\delta_{i12{\cdots}K^\ast}\prod_{k=1}^{K^\ast}\alpha_{lk}$ is the equivalent form obtained by adding constraints to the G-DINA model. The two forms are linked by $g_i = \delta_{i0}$ and $1-s_i = \delta_{i0}+\delta_{i12{\cdots}K^\ast}$.
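The equivalence between the original DINA form and the constrained G-DINA form can be checked numerically. This base-R sketch assumes one item measuring attributes 1 and 3 out of K = 3; the q-vector and the guessing and slip values are illustrative only.

```r
# Item requires attributes 1 and 3 (illustrative q-vector and parameters)
q <- c(1, 0, 1)
g <- 0.15   # guessing parameter
s <- 0.10   # slip parameter

# All 2^K attribute mastery patterns, one per row
alpha <- as.matrix(expand.grid(a1 = 0:1, a2 = 0:1, a3 = 0:1))

# eta = prod_k alpha_k^{q_k}: 1 only if every required attribute is mastered
# (R evaluates 0^0 as 1, so attributes that are not required drop out)
eta_dina <- apply(alpha, 1, function(a) prod(a^q))

# Original DINA form
p_dina <- (1 - s)^eta_dina * g^(1 - eta_dina)

# Constrained G-DINA form: intercept plus the highest-order interaction only
delta0    <- g
delta_all <- (1 - s) - g
p_gdina   <- delta0 + delta_all * alpha[, 1] * alpha[, 3]

all.equal(p_dina, p_gdina)  # TRUE
```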
In contrast to the DINA model, the DINO model assumes that an examinee can respond correctly to an item if they have mastered at least one of the attributes the item measures. Like the DINA model, the DINO model also includes guessing and slip parameters. The main difference between the DINO and DINA models therefore lies in how $\eta_{li}$ is defined. In the DINO model, it is given by:
$$
\eta_{li} = 1-\prod_{k=1}^{K}(1 - \alpha_{lk})^{q_{ik}}
$$
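The DINO definition of the latent response can be evaluated in the same way as the DINA one. The sketch below uses base R only and assumes an illustrative item measuring attributes 1 and 3 out of K = 3, with made-up guessing and slip values.

```r
# Item measures attributes 1 and 3 (illustrative q-vector and parameters)
q <- c(1, 0, 1)
g <- 0.15   # guessing parameter
s <- 0.10   # slip parameter

# All 2^K attribute mastery patterns, one per row
alpha <- as.matrix(expand.grid(a1 = 0:1, a2 = 0:1, a3 = 0:1))

# eta = 1 - prod_k (1 - alpha_k)^{q_k}: 1 if at least one measured
# attribute is mastered, 0 only if none of them is
eta_dino <- apply(alpha, 1, function(a) 1 - prod((1 - a)^q))

# Same guessing/slip response function as DINA, with the DINO eta
p_dino <- (1 - s)^eta_dino * g^(1 - eta_dino)

cbind(alpha, eta = eta_dino, p = p_dino)
```

Only the two patterns that master neither attribute 1 nor attribute 3 fall back to the guessing probability; under the DINA rule, six of the eight patterns would.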
An object of class CDM containing the following components:
A GDINA object obtained from the GDINA package, or a list produced by the BM algorithm, depending on which estimation method is used.
Individuals' attribute parameters calculated by the EAP method.
Individuals' posterior probabilities.
Individuals' marginal mastery probability matrix.
Attribute prior weights used to calculate the marginalized likelihood in the last iteration.
Some basic model-fit indices, including Deviance, npar, AIC, and BIC. See also fit.
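The relationship between the posterior probabilities, the marginal mastery probabilities, the EAP classifications, and the fit indices can be sketched in base R. This is an illustration under common conventions (a 0.5 cut-off for EAP, AIC = Deviance + 2 npar, BIC = Deviance + npar log N), not the package's internal code; all object names and numbers below are hypothetical.

```r
# Attribute patterns 00, 10, 01, 11 for K = 2 attributes
patterns <- as.matrix(expand.grid(a1 = 0:1, a2 = 0:1))

# Hypothetical posterior probabilities of each pattern for three examinees
# (one row per examinee, rows sum to 1)
posterior <- rbind(
  c(0.70, 0.15, 0.10, 0.05),
  c(0.05, 0.60, 0.05, 0.30),
  c(0.02, 0.08, 0.10, 0.80)
)

# Marginal mastery probability: sum the posterior over patterns
# in which the attribute is mastered
mastery <- posterior %*% patterns

# EAP classification, assuming the usual 0.5 cut-off
eap <- (mastery > 0.5) * 1

# Information criteria from a hypothetical deviance, parameter count,
# and sample size
dev  <- 15250.4
npar <- 120
N    <- 500
aic  <- dev + 2 * npar
bic  <- dev + npar * log(N)
```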
Haijiang Qin <Haijiang133@outlook.com>
de la Torre, J. (2009). DINA Model and Parameter Estimation: A Didactic. Journal of Educational and Behavioral Statistics, 34(1), 115-130. DOI: 10.3102/1076998607309474.
de la Torre, J., & Douglas, J. A. (2004). Higher-order latent trait models for cognitive diagnosis. Psychometrika, 69(3), 333-353. DOI: 10.1007/BF02295640.
de la Torre, J. (2011). The Generalized DINA Model Framework. Psychometrika, 76(2), 179-199. DOI: 10.1007/s11336-011-9207-7.
Haertel, E. H. (1989). Using restricted latent class models to map the skill structure of achievement items. Journal of Educational Measurement, 26(4), 301-323. DOI: 10.1111/j.1745-3984.1989.tb00336.x.
Hartz, S. M. (2002). A Bayesian framework for the unified model for assessing cognitive abilities: Blending theory with practicality (Unpublished doctoral dissertation). University of Illinois at Urbana-Champaign.
Henson, R. A., Templin, J. L., & Willse, J. T. (2009). Defining a Family of Cognitive Diagnosis Models Using Log-Linear Models with Latent Variables. Psychometrika, 74(2), 191-210. DOI: 10.1007/s11336-008-9089-5.
Huebner, A., & Wang, C. (2011). A note on comparing examinee classification methods for cognitive diagnosis models. Educational and Psychological Measurement, 71, 407-419. DOI: 10.1177/0013164410388832.
Junker, B. W., & Sijtsma, K. (2001). Cognitive assessment models with few assumptions, and connections with nonparametric item response theory. Applied Psychological Measurement, 25(3), 258-272. DOI: 10.1177/01466210122032064.
Ma, W., & Jiang, Z. (2020). Estimating Cognitive Diagnosis Models in Small Samples: Bayes Modal Estimation and Monotonic Constraints. Applied Psychological Measurement, 45(2), 95-111. DOI: 10.1177/0146621620977681.
Templin, J. L., & Henson, R. A. (2006). Measurement of psychological disorders using cognitive diagnosis models. Psychological Methods, 11(3), 287-305. DOI: 10.1037/1082-989X.11.3.287.
Tu, D., Chiu, J., Ma, W., Wang, D., Cai, Y., & Ouyang, X. (2022). A multiple logistic regression-based (MLR-B) Q-matrix validation method for cognitive diagnosis models: A confirmatory approach. Behavior Research Methods. DOI: 10.3758/s13428-022-01880-x.
validation.
################################################################
#                          Example 1                           #
#              fit the GDINA model using MMLE/EM               #
################################################################
set.seed(123)
library(Qval)

## generate Q-matrix and data to fit
K <- 3
I <- 30
example.Q <- sim.Q(K, I)
IQ <- list(
  P0 = runif(I, 0.0, 0.2),
  P1 = runif(I, 0.8, 1.0)
)
example.data <- sim.data(Q = example.Q, N = 500, IQ = IQ,
                         model = "GDINA", distribute = "horder")

## using MMLE/EM to fit GDINA model
example.CDM.obj <- CDM(example.data$dat, example.Q, model = "GDINA",
                       method = "EM", maxitr = 2000, verbose = 1)

################################################################
#                          Example 2                           #
#              fit the GDINA model using MMLE/BM               #
################################################################
set.seed(123)
library(Qval)

## generate Q-matrix and DINA data to fit
K <- 5
I <- 30
example.Q <- sim.Q(K, I)
IQ <- list(
  P0 = runif(I, 0.0, 0.2),
  P1 = runif(I, 0.8, 1.0)
)
example.data <- sim.data(Q = example.Q, N = 500, IQ = IQ,
                         model = "DINA", distribute = "horder")

## using MMLE/BM to fit GDINA model
example.CDM.obj <- CDM(example.data$dat, example.Q, model = "GDINA",
                       method = "BM", maxitr = 1000, verbose = 2)

################################################################
#                          Example 3                           #
#                  fit the ACDM using MMLE/EM                  #
################################################################
set.seed(123)
library(Qval)

## generate Q-matrix and data to fit
K <- 5
I <- 30
example.Q <- sim.Q(K, I)
IQ <- list(
  P0 = runif(I, 0.0, 0.2),
  P1 = runif(I, 0.8, 1.0)
)
example.data <- sim.data(Q = example.Q, N = 500, IQ = IQ,
                         model = "ACDM", distribute = "horder")

## using MMLE/EM to fit ACDM
example.CDM.obj <- CDM(example.data$dat, example.Q, model = "ACDM",
                       method = "EM", maxitr = 2000, verbose = 1)